利用pymysql与Pandas读取远程数据库上的表格,并存储为本地csv文件。
因为业务需要读取服务器上的数据进行分析,能连接到sql服务器,但是当我企图用mysqldump备份(复制)database到本地却经常出现mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table
错误,这个需要改net_write_timeout
来增加等待时长来解决,苦于我的账号并没有权限,难不成只好之间上sql用odbs大法直接做数据分析?我的天,那得多麻烦,经过几番思考,我的解决方案是:用sql的python接口包pymysql进行sql登录以及操作,再用Pandas以DataFrame格式接受table数据,最后存为csv文件。
登录模块 注意,若数据库含有中文,需要use_unicode=True
避免乱码
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 import pymysqlhost = '#########' port = user = '########' password = '#######' def get_database (): """ :return: return the pre-connect database """ connection = pymysql.connect(host=host, user=user, password=password, port=port, charset='utf8mb4' , use_unicode=True ,) return connection
处理模块 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 from connection import get_databaseimport pandas as pddef execute (sql_command ): try : with connection.cursor() as cursor: cursor.execute(sql_command) result = cursor.fetchall() print ("COMMAND: " , sql_command) print ("RESULT: " , result) print () dic[sql_command] = result except : print ("Command %s failed!" % sql_command) print () connection = get_database() connection.commit() dic = {} sql_commands = [ "use wind_data;" ,] for sql_command in sql_commands: execute(sql_command) tables = [] l = list (dic["show tables;" ]) for i in l: tables.append(i[0 ]) for table in tables: sql_cmd = "select * from %s" % table df = pd.read_sql(sql=sql_cmd, con=connection) df.to_csv("data/%s.csv" % table) connection.close()
参考自