Looking for direction
Dave Angel
davea at davea.name
Wed May 13 21:12:53 EDT 2015
On 05/13/2015 08:45 PM, 20/20 Lab wrote:

You accidentally replied to me, rather than the mailing list. Please
use reply-list, or if your mailer can't handle that, do a Reply-All, and
remove the parts you don't want.

>
> On 05/13/2015 05:07 PM, Dave Angel wrote:
>> On 05/13/2015 07:24 PM, 20/20 Lab wrote:
>>> I'm a beginner to python. Reading here and there. Written a couple of
>>> short and simple programs to make life easier around the office.
>>>
>> Welcome to Python, and to this mailing list.
>>
>>> That being said, I'm not even sure what I need to ask for. I've never
>>> worked with external data before.
>>>
>>> I have a LARGE csv file that I need to process. 110+ columns, 72k
>>> rows.
>>
>> That's not very large at all.
>>
> In the grand scheme, I guess not. However I'm currently doing this
> whole process using office. So it can be a bit daunting.

I'm not familiar with the "office" operating system.

>>> I managed to write enough to reduce it to a few hundred rows, and
>>> the five columns I'm interested in.
>>
>>>
>>> Now is where I have my problem:
>>>
>>> myList = [ [123, "XXX", "Item", "Qty", "Noise"],
>>>            [72976, "YYY", "Item", "Qty", "Noise"],
>>>            [123, "XXX", "ItemTypo", "Qty", "Noise"] ]
>>>
>>
>> It'd probably be useful to identify names for your columns, even if
>> it's just in a comment. Guessing from the paragraph below, I figure
>> the first two columns are "account" & "staff"
>
> The columns that I pull are Account, Staff, Item Sold, Quantity sold,
> and notes about the sale (notes aren't particularly needed, but the
> higher ups would like them in the report)
>>
>>> Basically, I need to check for rows with duplicate accounts (row[0]) and
>>> staff (row[1]), and if so, remove that row, and add its Qty to the
>>> original row.
>>
>> And which column is that supposed to be? Shouldn't there be a number
>> there, rather than a string?
>>
>>> I really don't have a clue how to go about this. The
>>> number of rows changes based on which run it is, so I couldn't even get
>>> away with using hundreds of compare loops.
>>>
>>> If someone could point me to some documentation on the functions I would
>>> need, or a tutorial it would be a great help.
>>>
>>
>> Is the order significant? Do you have to preserve the order that the
>> accounts appear? I'll assume not.
>>
>> Have you studied dictionaries? Seems to me the way to handle the
>> problem is to read in a row, create a dictionary with key of (account,
>> staff), and data of the rest of the line.
>>
>> Each time you read a row, you check if the key is already in the
>> dictionary. If not, add it. If it's already there, merge the data as
>> you say.
>>
>> Then when you're done, turn the dict back into a list of lists.
>>
> The order is irrelevant. No, I've not really studied dictionaries, but
> a few people have mentioned it. I'll have to read up on them and, more
> importantly, their applications. Seems that they are more versatile
> than I thought.
>
> Thank you.

You have to realize that a tuple can be used as a key, in your case a
tuple of Account and Staff.

You'll have to decide how you're going to merge the ItemSold,
QuantitySold, and notes.

-- 
DaveA
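
The dictionary approach described in this message can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name merge_rows, the sample quantities, and the merge policy (keep the first item name, sum the quantities, join the notes with "; ") are all assumptions, since the message leaves the merge decision to the original poster. It also assumes the quantity column already holds numbers; values read from a CSV file arrive as strings and would need int() first.

```python
# Sample rows: account, staff, item sold, quantity sold, notes.
# Assumption: quantities are numbers, not the "Qty" placeholder strings.
rows = [
    [123, "XXX", "Item", 2, "Noise"],
    [72976, "YYY", "Item", 1, "Noise"],
    [123, "XXX", "ItemTypo", 3, "More noise"],
]


def merge_rows(rows):
    """Combine rows sharing (account, staff), summing their quantities."""
    merged = {}
    for account, staff, item, qty, notes in rows:
        key = (account, staff)  # a tuple is hashable, so it can be a dict key
        if key not in merged:
            # First time this pair is seen: store the rest of the row.
            merged[key] = [item, qty, notes]
        else:
            # Duplicate pair: add the quantity to the original row.
            # Hypothetical policy: keep the first item, join the notes.
            merged[key][1] += qty
            merged[key][2] = merged[key][2] + "; " + notes
    # Turn the dict back into a list of lists.
    return [[account, staff] + rest for (account, staff), rest in merged.items()]


result = merge_rows(rows)
```

With the sample data, the two rows for account 123 and staff "XXX" collapse into one row with a quantity of 5, and the row for account 72976 passes through unchanged. The loop makes a single pass over the input, so it works the same whether a run has a few hundred rows or many more.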