python - Combine multiple csv files -
i using q
transform csv file: log.csv (file linked). it's format is:
datapath,port,rxpkts,rxbytes,rxerror,txpkts,txbytes,txerror 4,1,178,25159,0,40,3148,0 4,2,3,230,0,213,27897,0 4,3,3,230,0,212,27807,0 4,4,4,320,0,211,27717,0 4,5,3,230,0,212,27807,0 4,6,3,230,0,212,27807,0 4,7,4,320,0,211,27717,0 4,8,4,320,0,211,27717,0 4,9,4,320,0,211,27717,0 4,a,4,320,0,211,27717,0 4,b,3,230,0,212,27807,0 4,fffffffe,7,578,0,209,27549,0 3,1,197,26863,0,21,1638,0 3,2,3,230,0,215,28271,0 3,3,5,390,0,215,28271,0 3,4,2,140,0,216,28361,0 3,5,4,320,0,214,28181,0 3,6,3,230,0,215,28271,0 3,fffffffe,7,578,0,212,28013,0 5,1,208,27401,0,6,488,0 5,fffffffe,7,578,0,208,27401,0 2,1,180,24228,0,18,1368,0 2,2,2,140,0,195,25366,0 2,3,2,140,0,195,25366,0 2,4,3,230,0,194,25276,0 2,5,3,230,0,194,25276,0 2,6,2,140,0,195,25366,0 2,fffffffe,7,578,0,191,25018,0 1,1,38,5096,0,182,23602,0 1,2,42,5419,0,179,23369,0 1,3,61,7152,0,159,21546,0 1,4,28,4611,0,192,24087,0 1,5,46,6022,0,174,22676,0 1,fffffffe,7,578,0,214,28210,0
i want covert format:
the number of ports can vary.
current code:
python q -h -d "," "select rxpkts, txpkts ./log.csv datapath = , port = j" > i_j.csv;
so made i*j
number of files , combined them manually. there way in 1 go modifying above sql query or combine files using python or using pandas suggested in comments?
import subprocess def printit(): in range(1,6): j in range(1,6): query = "select rxpkts, txpkts ./log.csv datapath = "+str(i)+" , port = "+str(j) filename = str(i)+"_"+str(j)+".csv" open(filename, "w+") f: p = subprocess.popen(["python", "q", "-h", "-d", ",", query], stdout=f) printit()
you can use set_index
stack
.
import pandas pd # data # ====================================== print(df) datapath port rxpkts ... txpkts txbytes txerror 0 4 1 178 ... 40 3148 0 1 4 2 3 ... 213 27897 0 2 4 3 3 ... 212 27807 0 3 4 4 4 ... 211 27717 0 4 4 5 3 ... 212 27807 0 5 4 6 3 ... 212 27807 0 6 4 7 4 ... 211 27717 0 7 4 8 4 ... 211 27717 0 8 4 9 4 ... 211 27717 0 9 4 4 ... 211 27717 0 .. ... ... ... ... ... ... ... 24 2 4 3 ... 194 25276 0 25 2 5 3 ... 194 25276 0 26 2 6 2 ... 195 25366 0 27 2 fffffffe 7 ... 191 25018 0 28 1 1 38 ... 182 23602 0 29 1 2 42 ... 179 23369 0 30 1 3 61 ... 159 21546 0 31 1 4 28 ... 192 24087 0 32 1 5 46 ... 174 22676 0 33 1 fffffffe 7 ... 214 28210 0 [34 rows x 8 columns] # reshaping # ====================================== series_res = df[df.columns[:4]].set_index(['datapath', 'port']).stack() series_res.name = 'value' datapath port 4 1 rxpkts 178 rxbytes 25159 2 rxpkts 3 rxbytes 230 3 rxpkts 3 rxbytes 230 4 rxpkts 4 rxbytes 320 5 rxpkts 3 rxbytes 230 ... 1 2 rxpkts 42 rxbytes 5419 3 rxpkts 61 rxbytes 7152 4 rxpkts 28 rxbytes 4611 5 rxpkts 46 rxbytes 6022 fffffffe rxpkts 7 rxbytes 578 name: value, dtype: int64 df_res = pd.dataframe(series_res) df_res.t datapath 4 ... 1 port 1 2 3 ... 4 5 fffffffe rxpkts rxbytes rxpkts rxbytes rxpkts rxbytes ... rxpkts rxbytes rxpkts rxbytes rxpkts rxbytes value 178 25159 3 230 3 230 ... 28 4611 46 6022 7 578 [1 rows x 68 columns]
Comments
Post a Comment