How to compare lines in a data (.csv) file based on two values, then roll up the data using Python?
I have a .csv file that has 25 columns. In the data, column 18 is people_id and column 19 is the donation date. I have pre-sorted the data using Linux so that all people IDs appear together, sorted by donation date in descending order.
Here is where I'm not sure how to proceed. I need to find the lines that have the same people_id and donation date, sum various values, and output a single line. Essentially, every line in the output file should be either a different customer, or a different donation date for the same customer. Would it be best to use a dictionary with people_id as the key? How would I do that syntactically?
I was thinking of something like this:
    import csv

    data_dict = {}

    with open("file.csv", newline="") as csv_file:
        for row in csv.reader(csv_file, delimiter=','):
            if row[18] in data_dict:
                pass  # do something
I'd recommend an object-oriented approach.
    import csv

    class Transaction:
        def __init__(self, fields):
            # Placeholder names: use whatever fields you have. The slice is
            # needed because each row has more columns than are unpacked here.
            self.name, self.age, self.car, self.ident = fields[:4]
            # Keep in mind that these are strings, so you
            # may need to process them before analysis.

        def calculation(self):
            # Placeholder: convert the strings and combine the values you need.
            return int(self.age) + int(self.ident)

    transactions = {}

    with open('csv_file.csv', newline='') as f:
        for row in csv.reader(f):
            bucket = tuple(row[18:20])  # (people_id, donation date)
            if bucket in transactions:
                transactions[bucket].append(Transaction(row))
            else:
                transactions[bucket] = [Transaction(row)]

    for bucket in transactions:
        print(bucket, sum(item.calculation() for item in transactions[bucket]))
This defines a transaction class, instances of which contain the various fields that come from the CSV file. It then starts a dictionary of transactions and loops through the CSV file, adding each new transaction object either to a new bucket (if the given ID and date haven't been seen before) or to an existing bucket (if the given ID and date have been seen before).
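The new-bucket versus existing-bucket branching can be collapsed with `collections.defaultdict`. What follows is only a minimal sketch, assuming a tiny three-column layout (people_id, donation date, amount) in place of the real 25 columns; the sample rows and the name `bucket_rows` are made up for illustration.

```python
from collections import defaultdict

def bucket_rows(rows, key_columns=(0, 1)):
    """Group rows into lists keyed by the values found in key_columns."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        # A key that has not been seen yet starts with an empty list,
        # so no explicit if/else is needed.
        buckets[key].append(row)
    return buckets

# Hypothetical sample data: people_id, donation date, amount.
sample_rows = [
    ["101", "2015-03-02", "25.00"],
    ["101", "2015-03-02", "10.00"],
    ["101", "2015-01-15", "5.00"],
    ["202", "2015-02-20", "40.00"],
]

buckets = bucket_rows(sample_rows)
for key, rows in buckets.items():
    print(key, len(rows))
```

For the file in the question, `key_columns` would presumably be `(18, 19)`, assuming those are zero-based indexes as in `row[18]`.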
It then goes through the dictionary and performs the calculation for each bucket, printing the bucket and the result of the calculation.
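Because the file in the question is already sorted so that rows with the same people_id and donation date sit next to each other, the roll-up could also be sketched with `itertools.groupby`, which avoids holding every row in memory. This is an alternative to the dictionary approach, not part of the answer itself; the three-column layout, the sample values, and the name `roll_up` are assumptions for illustration.

```python
import csv
import io
from itertools import groupby

def roll_up(rows, key_columns=(0, 1), amount_column=2):
    """Yield one [key..., total] row per run of adjacent rows sharing a key."""
    def key_of(row):
        return tuple(row[i] for i in key_columns)

    for key, group in groupby(rows, key=key_of):
        # CSV values are strings, so convert before summing.
        total = sum(float(row[amount_column]) for row in group)
        yield [*key, f"{total:.2f}"]

# Hypothetical pre-sorted input: people_id, donation date, amount.
sample_csv = io.StringIO(
    "101,2015-03-02,25.00\n"
    "101,2015-03-02,10.00\n"
    "101,2015-01-15,5.00\n"
    "202,2015-02-20,40.00\n"
)
rolled = list(roll_up(csv.reader(sample_csv)))

# Write one output line per (people_id, donation date) pair.
output = io.StringIO()
csv.writer(output, lineterminator="\n").writerows(rolled)
print(output.getvalue(), end="")
```

Note that `groupby` only merges adjacent rows, so this relies on the pre-sorting described in the question; on unsorted input the same key could appear in several output lines.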