Self-study course: creating a workset and adding elements to it
Category: Programming

Without further ado, here is the code!

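This continues from the previous post, which ended by mentioning that the scraped movie information should be written to a database so it is easier to browse. Today we implement that.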

  Chapter 5: Characters and Strings

 using System.Collections.Generic;
 using Autodesk.Revit.DB;

 public class WorkSetHelper
 {
        public void AddElementsToWorkSet(Document doc, List<Element> elements)
        {
            if (doc.IsWorkshared == true)
            {
                var workset = GetWorkset(doc);
                if (workset != null)
                {
                    var worksetID = workset.Id.IntegerValue;
                    using (Transaction tran = new Transaction(doc, "[ToolSet] Add Elements To WorkSet"))
                    {
                        tran.Start();
                        foreach (var ele in elements)
                        {
                            Parameter wsparam = ele.get_Parameter(BuiltInParameter.ELEM_PARTITION_PARAM);
                            if (wsparam != null)
                            {
                                wsparam.Set(worksetID);
                            }                          
                        }
                        tran.Commit();
                    }                   
                }               
            }
        }

        public Workset GetWorkset(Document doc)
        {
            Workset newWorkset = null;
            // Worksets can only be created in a document with worksharing enabled
            if (doc.IsWorkshared)
            {
                string worksetName = "WorkSetName";
                // Workset name must not be in use by another workset
                if (WorksetTable.IsWorksetNameUnique(doc, worksetName))
                {
                    using (Transaction tran = new Transaction(doc, "[ToolSet] Create Work Set For ToolSet"))
                    {
                        tran.Start();
                        newWorkset = Workset.Create(doc, worksetName);
                        tran.Commit();
                    }
                }
                else
                {
                    IList<Workset> worksetList = new FilteredWorksetCollector(doc).OfKind(WorksetKind.UserWorkset).ToWorksets();
                    foreach (Workset workset in worksetList)
                    {
                        // Match the exact name so a similarly named workset is not returned by mistake
                        if (workset.Name == worksetName)
                        {
                            return workset;
                        }
                    }
                }
            }
            return newWorkset;
        }
 }

As usual, the code comes first:

  1. Using the char type

 

# -*- coding:utf-8 -*-
import requests
import re
import mysql.connector

# changepage builds the list of index-page URLs, one per page
def changepage(url,total_page):
    page_group = [url]
    for i in range(2,total_page+1):
        # pages after the first are named index_2.html, index_3.html, ...
        link = re.sub('jddy/index','jddy/index_'+str(i),url)
        page_group.append(link)
    return page_group
# pagelink collects the movie detail-page links found on one index page
def pagelink(url):
    base_url = 'https://www.dygod.net/html/gndy/jddy/'
    headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}
    req = requests.get(url , headers = headers)
    req.encoding = 'gbk'  # set the encoding explicitly, otherwise the text is garbled
    pat = re.compile('<a href="/html/gndy/jddy/(.*?)" class="ulink" title=(.*?)/a>',re.S)  # matches the links in the movie list
    reslist = re.findall(pat, req.text)

    finalurl = []
    # skip the first match and take at most 24 links; slicing avoids an IndexError on short pages
    for item in reslist[1:25]:
        finalurl.append(base_url + item[0])
    return finalurl  # all detail-page URLs found on this page

# getdownurl fetches one detail page and returns its download address and movie details
def getdownurl(url):
    headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'}
    req = requests.get(url , headers = headers)
    req.encoding = 'gbk'  # set the encoding explicitly, otherwise the text is garbled

    pat = re.compile('<a href="ftp(.*?)">ftp',re.S)  # matches the download address
    reslist = re.findall(pat, req.text)
    furl = 'ftp'+reslist[0] if reslist else ''  # empty string when the page has no ftp link

    pat2 = re.compile('<!--Content Start-->(.*?)<!--duguPlayList Start-->',re.S)  # matches the movie details
    reslist2 = re.findall(pat2, req.text)
    reslist3 = re.sub('</?p>','',reslist2[0]) if reslist2 else ''  # strip only the <p> and </p> tags
    fdetail = reslist3.split('◎')

    return (furl,fdetail)

# create the movies table
def createtable(con,cs):
    # define the table schema; a triple-quoted string lets the statement span several lines
    cs.execute('''create table if not exists movies (film_addr varchar(1000), cover_pic varchar(1000), name varchar(100) primary key,
     ori_name varchar(100),prod_year varchar(100), prod_country varchar(100), category varchar(100), language varchar(100),
     subtitle varchar(100), release_date varchar(100), score varchar(100), file_format varchar(100), video_size varchar(100),
     file_size varchar(100), film_length varchar(100), director varchar(100), actors varchar(500), profile varchar(2000),capt_pic varchar(1000))''')
    # commit the transaction
    con.commit()

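# insert the download address and the movie details into the table
def inserttable(con,cs,x,y):
    try:
        cs.execute('insert into movies values (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)',
                   (x,y[0],y[1],y[2],y[3],y[4],y[5],y[6],y[7],y[8],y[9],y[10],y[11],y[12],y[13],y[14],y[15],y[16],y[17]))
    except Exception as e:
        # skip rows that fail (for example a duplicate name or missing fields), but report why
        print('insert skipped: %s' % e)
    finally:
        con.commit()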

if __name__ == "__main__" :
    html = "https://www.dygod.net/html/gndy/jddy/index.html"
    print('About to crawl: https://www.dygod.net/html/gndy/jddy/index.html')
    pages = input('Number of pages to crawl: ')
    p1 = changepage(html,int(pages))

    # open the database connection
    conn = mysql.connector.connect(user='py', password='Unix_1234', database='py_test')
    cursor = conn.cursor()
    createtable(conn,cursor)
    # insert the data
    j = 0
    for p1i in p1 :
        j = j + 1
        print('Crawling page %d, URL: %s ...'%(j,p1i))
        p2 = pagelink(p1i)
        for p2i in p2 :
            p3,p4 = getdownurl(p2i)
            # only store movies for which a download address was found
            if p3 :
                inserttable(conn,cursor,p3,p4)
    # close the database connection
    cursor.close()
    conn.close()
    print('All pages have been crawled!')
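
To see what the detail-page parsing step produces without touching the network, here is a minimal offline sketch. The `SAMPLE_HTML` fragment and the `parse_detail` helper are hypothetical, written only for illustration; the regular expressions and the split on '◎' follow the scraper above, assuming the real pages have the same shape.

```python
import re

# Hypothetical fragment shaped like the page section the scraper targets;
# every value in it is made up for illustration.
SAMPLE_HTML = (
    '<!--Content Start--><p>cover.jpg</p>'
    '<p>◎Title Example Movie</p><p>◎Year 2018</p>'
    '<!--duguPlayList Start-->'
    '<a href="ftp://example.invalid/movie.mkv">ftp://example.invalid/movie.mkv</a>'
)

def parse_detail(html):
    """Return (download_url, detail_fields) extracted from one page's HTML."""
    # Download address: empty string when no ftp link is present
    links = re.findall(r'<a href="ftp(.*?)">ftp', html, re.S)
    furl = 'ftp' + links[0] if links else ''
    # Detail block: text between the two HTML comments, with <p> tags removed
    blocks = re.findall(r'<!--Content Start-->(.*?)<!--duguPlayList Start-->', html, re.S)
    text = re.sub(r'</?p>', '', blocks[0]) if blocks else ''
    # Each field is introduced by the '◎' marker
    fields = [part.strip() for part in text.split('◎')]
    return furl, fields

furl, fields = parse_detail(SAMPLE_HTML)
print(furl)
print(fields)
```

The first element of `fields` is whatever precedes the first '◎' (the cover image in this sample), which is why the insert statement treats `y[0]` as `cover_pic`.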

  2. Using escape characters

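Closing notes:

Compared with the earlier posts, the most important new topic here is database access. What follows is a brief introduction to connecting to a database from Python.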

  3. Using the string class

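       Two weeks into working at the design institute, I have found a large gap between the Revit plug-ins on the market and what the institute actually needs; the institute is especially demanding about how views are displayed. Being the only person at the center developing Revit plug-ins is a lot of pressure. I keep telling myself to take it one step at a time. I can do this!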

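1. Using MySQL from Python requires a driver. Common choices are the official mysql-connector-python, plus MySQLdb (Python 2.x) and PyMySQL (Python 3.x). These modules are both drivers and tools: they can operate on a MySQL database directly, which means you work by writing SQL statements inside Python. For example, to create a user table: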

  4. Comparing strings

cursor.execute('create table user (id int, name varchar(20))')

  5. Formatting strings

# The create table statement here is a typical SQL statement.
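
The connect, cursor, execute, commit pattern described above can be tried without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in (an assumption made so the example runs anywhere); the overall flow is the same with mysql.connector, but note that sqlite3 uses `?` as its parameter placeholder where mysql.connector uses `%s`. The inserted row is made-up sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server (assumption)
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

# Plain SQL passed as a string, exactly as in the create table example
cursor.execute('create table user (id int, name varchar(20))')

# Values are passed separately from the SQL text; sqlite3 uses ? placeholders
cursor.execute('insert into user (id, name) values (?, ?)', (1, 'Michael'))
conn.commit()

# Read the row back
cursor.execute('select id, name from user where id = ?', (1,))
rows = cursor.fetchall()
print(rows)

cursor.close()
conn.close()
```

Passing values as a separate tuple, rather than formatting them into the SQL string, lets the driver handle quoting and is the same approach the scraper's insert statement takes.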

  6. Extracting substrings and splitting strings
