Python3 print selected values of dict - python-3.x

In this simple code to read a TSV file with many columns:
InColnames = ['Chr','Pos','Ref','Alt']
tsvin = csv.DictReader(fin, delimiter='\t')
for row in tsvin:
    print(', '.join(row[InColnames]))
How can I make the print work?

The following will do:
for row in tsvin:
    print(', '.join(row[col] for col in InColnames))
You cannot pass a list of keys to the dict's item-lookup and magically get a list of values. You have to somehow iterate the keys and retrieve each one's value individually. The approach at hand uses a generator expression for that.
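As a self-contained illustration of the generator-expression approach, here is a minimal sketch. The sample data, the extra `Qual` column, and the helper name `select_columns` are assumptions made for the example; an in-memory `io.StringIO` stands in for the real file handle `fin`.

```python
import csv
import io

# Hypothetical sample data standing in for the real TSV file.
sample = "Chr\tPos\tRef\tAlt\tQual\n1\t12345\tA\tG\t50\nX\t67890\tC\tT\t99\n"

InColnames = ['Chr', 'Pos', 'Ref', 'Alt']

def select_columns(fin, colnames):
    """Return one comma-joined string per row, keeping only colnames."""
    tsvin = csv.DictReader(fin, delimiter='\t')
    # Look up each wanted key individually; a dict cannot be indexed by a list.
    return [', '.join(row[col] for col in colnames) for row in tsvin]

lines = select_columns(io.StringIO(sample), InColnames)
for line in lines:
    print(line)
```

A missing column name would raise `KeyError` here; `row.get(col, '')` is one way to tolerate that, if that behaviour is wanted.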

Related

I'm trying to remove certain words from a column on each row in my dataframe

I'm still trying to understand how pandas works, so please bear with me. In this exercise, I'm trying to access a particular column, ['Without Stop Words'], which holds a list of words on each row. I wish to remove certain words from each row of that column. The words to be removed are specified in a dictionary called stop_words_dict. Here's my code, but the dataframe seems to be unchanged after running it.
def stop_words_remover(df):
    # your code here
    df['Without Stop Words'] = df['Tweets'].str.lower().str.split()
    for i, r in df.iterrows():
        for word in df['Without Stop Words']:
            if word in stop_words_dict.items():
                df['Without Stop Words'][i] = df['Without Stop Words'].str.remove(word)
    return df
The input and the expected output were shown as images.
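In pandas, it's generally a bad idea to loop over your dataframe row by row to try to change rows. Instead, try using methods like .apply().
An example for stop words, together with a list comprehension, where stop_words is the collection of words held in stop_words_dict (testing against stop_words_dict.items() never matches, because items() yields (key, value) tuples rather than words):
df['Without Stop Words'] = df['Without Stop Words'].apply(lambda words: [w for w in words if w not in stop_words])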
See https://stackoverflow.com/a/29523440/12904151 for more context.
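The per-row filtering can be sketched in plain Python. The structure of `stop_words_dict` is not shown in the question, so the sketch assumes its values are lists of words; the sample tweet and the helper name `remove_stop_words` are also hypothetical.

```python
# Assumed structure: a dict whose values are lists of stop words.
stop_words_dict = {'stopwords': ['the', 'is', 'a', 'on']}

# Flatten every list in the dict into one set for fast membership tests.
stop_words = {word for words in stop_words_dict.values() for word in words}

def remove_stop_words(words):
    """Return the words that are not stop words, keeping their order."""
    return [word for word in words if word not in stop_words]

tweet = "The power is out on a Monday"   # hypothetical tweet
tokens = tweet.lower().split()
cleaned = remove_stop_words(tokens)
print(cleaned)

# With pandas, the same function would be applied to every row, e.g.:
# df['Without Stop Words'] = (
#     df['Tweets'].str.lower().str.split().apply(remove_stop_words)
# )
```

Assigning the result of `.apply()` back to the column is what actually changes the dataframe.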

Iterating thru a not so ordinary Dictionary in python 3.x

Maybe this is an ordinary issue about iterating through a dict. Please find below the imovel.txt file, whose content is as follows:
{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
As you can see, this is not an ordinary dictionary of key-value pairs: the list under the first key holds the keys I want, and the list under the second key holds the matching values.
My code is:
#!/usr/bin/python
def load_dict_from_file():
    f = open('../txt/imovel.txt','r')
    data=f.read()
    f.close()
    return eval(data)

thisdict = load_dict_from_file()
for key,value in thisdict.items():
    print(value)
and it yields:
['primeiro', 'segundo', 'terceiro']
['101', '201', '301']
I would like to print key-value pairs like
{'primeiro':'101', 'segundo':'201', 'terceiro':'301'}
Given the txt file above, is that possible?
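You should parse the file with something safer than eval (the builtin json module if the file is rewritten with double-quoted strings, or ast.literal_eval for the content as shown), but either way you'll still have the same structure.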
There are a few things you can do.
If you know both of the base key names ('Andar' and 'Apto'), you can do it as a one-line dict comprehension by zipping the values together.
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# One line dict comprehension
newdict = {key: value for key, value in zip(thisdict['Andar'], thisdict['Apto'])}
print(newdict)
If you don't know the names of the keys, you could call next on an iterator assuming they're the first 2 lists in your structure.
# what you'll get from the file
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}
# create an iterator of the values since the keys are meaningless here
iterator = iter(thisdict.values())
# the first group of values are the keys
keys = next(iterator, None)
# and the second are the values
values = next(iterator, None)
# zip them together and have dict do the work for you
newdict = dict(zip(keys, values))
print(newdict)
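For loading the file without `eval`, a minimal sketch using `ast.literal_eval` follows. The file's content is inlined as a string here so the example is self-contained; reading it from `../txt/imovel.txt` would work the same way.

```python
import ast

# The content of imovel.txt, inlined for the example.
text = "{'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101','201','301']}"

# literal_eval only accepts Python literals, so it cannot run arbitrary code.
thisdict = ast.literal_eval(text)

# Pair each entry of the first list with the matching entry of the second.
newdict = dict(zip(thisdict['Andar'], thisdict['Apto']))
print(newdict)
```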
As other folks have noted, that looks like JSON, and it'd probably be easier to parse it and read through it as such. But if that's not an option for some reason, you can loop through your dictionary this way, provided the lists at each key are all the same length:
for i, res in enumerate(thisdict[list(thisdict)[0]]):
    ith_values = [elem[i] for elem in thisdict.values()]
    print(ith_values)
If they're all different lengths, then you'll need to put some logic to check for that and print a blank or do some error handling for looking past the end of the list.
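One possible way to handle lists of different lengths is `itertools.zip_longest`, which pads the shorter lists instead of raising an error. This is a sketch of that alternative, not the approach from the answer above; the shortened 'Apto' list is hypothetical sample data.

```python
from itertools import zip_longest

# Hypothetical data: the second list is one element shorter than the first.
thisdict = {'Andar': ['primeiro', 'segundo', 'terceiro'], 'Apto': ['101', '201']}

# Walk all lists in parallel, filling the gaps with an empty string.
rows = list(zip_longest(*thisdict.values(), fillvalue=''))
for row in rows:
    print(row)
```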

How to merge lists from a loop in jupyter?

I want to find the rows in a data frame that have the same values in certain columns (sex, work class, education).
new_row_data=df.head(20)
new_center_clusters =new_row_data.head(20)
for j in range(len(new_center_clusters)):
    row=[]
    for i in range(len(new_row_data)):
        if (new_center_clusters.iloc[j][5] == new_row_data.iloc[i][5]):
            if(new_center_clusters.iloc[j][2] == new_row_data.iloc[i][2]):
                if(new_center_clusters.iloc[j][3] == new_row_data.iloc[i][3]):
                    if(new_center_clusters.iloc[j][0] != new_center_clusters.iloc[i][0]):
                        row.append(new_center_clusters.iloc[j][0])
                        row.append(new_center_clusters.iloc[i][0])
    myset = list(set(row))
    myset.sort()
    print(myset)
I need a single list that includes the IDs of all similar rows, but I cannot merge the per-iteration lists into one list.
I get this result:
I need to get something like this:
[1,12,8,17,3,18,4,19,5,13,6,9]
Thank you in advance.
If you want to combine the lists:
a=[1,3,4]
b=[2,4,1]
a.extend(b)
This leaves a as:
[1,3,4,2,4,1]
Similarly, if you want to remove the duplicates, convert it into a set and back into a list:
c=list(set(a))
This gives output such as:
[1,2,3,4]
Sets are unordered, so the original order of the elements is not preserved.
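Applied to the loop in the question, the idea would be to keep one list outside the loop and extend it on every iteration. The sketch below uses hypothetical per-iteration ID lists in place of the dataframe comparison, and `dict.fromkeys` to drop duplicates while keeping first-seen order, which `set` does not do.

```python
# Hypothetical ID lists, one per outer-loop iteration.
per_iteration = [[1, 12], [8, 17], [1, 12], [3, 18]]

merged = []                    # created once, before the loop
for ids in per_iteration:
    merged.extend(ids)         # add this iteration's IDs to the single list

# dict keys are unique and keep insertion order (Python 3.7+).
unique_ids = list(dict.fromkeys(merged))
print(unique_ids)
```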

Appending values to dictionary/list

I have a mylist = [[a,b,c,d],...[]] with 650 lists inside. I am trying to insert this into a relational database with dictionaries. I have the following code:
for i in mylist:
    if len(i) == 4:
        cve_ent = {'state':[], 'muni':[], 'area':[]}
        cve_ent['state'].append(i[1])
        cve_ent['muni'].append(i[2])
        cve_ent['area'].append(i[3])
However this code just yields the last list in mylist in the dictionary. I have tried also with a counter and a while loop but I cannot make it run.
I do not know if this is the fastest way to store the data. What I will do next is compare the values of the first and second keys with other tables, in order to multiply the values of the third key.
First of all, pull
cve_ent = {'state':[], 'muni':[], 'area':[]}
out of your for loop. As written, the dictionary is re-created with empty lists on every iteration, so only the values from the last list survive.
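A minimal sketch of the corrected loop, with the dictionary created once before iterating. The contents of `mylist` are hypothetical sample data.

```python
# Hypothetical sample data; the real mylist holds 650 inner lists.
mylist = [
    ['a', 's1', 'm1', 10],
    ['b', 's2', 'm2', 20],
    ['too', 'short'],          # skipped by the length check
    ['c', 's3', 'm3', 30],
]

# Create the dictionary once, outside the loop, so values accumulate.
cve_ent = {'state': [], 'muni': [], 'area': []}
for i in mylist:
    if len(i) == 4:
        cve_ent['state'].append(i[1])
        cve_ent['muni'].append(i[2])
        cve_ent['area'].append(i[3])

print(cve_ent)
```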

Python3 - CSV - Add Rows with namedtuple

I'm trying to write a method that adds new rows, following the interface below:
def row_add(self, **rowtoadd):
I don't see how, if I define my columns like:
stuff1, stuff2, stuff3
I can get a **namedtuple to sort itself in the correct order of columns names.
So far, I've tried (here table = the filepath we're editing, containing the csv we need):
def row_add(self, **rowtoadd):
    if os.path.isfile(self.table):
        with open(self.table, 'a') as csvfile:
            csvwriter = csv.writer(csvfile)
            csvwriter.writerow(rowtoadd)
But the namedtuple is not converted into a row; only the names of the variables are written.
ex:
row_add(stuff1="hello1", stuff2="hello2", stuff3="hello3")
cat ./my_file.csv -> stuff1, stuff2, stuff3
Try the following:
csvwriter.writerow(rowtoadd[x] for x in sorted(rowtoadd.keys()))
The issue is two-fold:
'rowtoadd' is a dict object. Before Python 3.7 the order of a dict's keys was not guaranteed, and even now the keys come in the order the caller passed the keyword arguments, which is not necessarily the column order of the file.
When you call writerow(rowtoadd), the default iterator of a dict is over its keys, which is why your csv file is getting the keys rather than the values.
In my line of code above, sorted(rowtoadd.keys()) sorts the keys of the dict, so that they are in a predictable (alphabetical) order. rowtoadd[x] for x in ... is a generator expression that yields the values in that key order, ready to be written to the file.
A key thing to understand here is that the csvwriter is not aware of the file's pre-existing structure. It doesn't know what order the keys should be in. You need to specify that order somehow. In this case, I specified the order alphabetically, but you may need to do it differently.
If you don't know the names of the fields beforehand, you could use positional arguments to keep the order of the fields. Positional arguments become a tuple, which is an ordered type in python:
def row_add(self, *row):
    if os.path.isfile(self.table):
        with open(self.table, 'a') as csvfile:
            csvwriter = csv.writer(csvfile)
            csvwriter.writerow(row)
This solution relies on the fact that the caller provides the arguments in the correct order.
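Another option, not part of the answers above, is to take the column order from the file's own header row and let `csv.DictWriter` arrange the values. The sketch below assumes the file already exists and starts with a header line; it uses a plain function with a `table` path argument instead of the method on `self`, and a temporary file as hypothetical sample data.

```python
import csv
import os
import tempfile

def row_add(table, **rowtoadd):
    """Append one row to the CSV at `table`, ordered by the file's header."""
    # Read the first line to learn the column order.
    with open(table, newline='') as csvfile:
        fieldnames = next(csv.reader(csvfile))
    # DictWriter places each value under its matching column name.
    with open(table, 'a', newline='') as csvfile:
        csv.DictWriter(csvfile, fieldnames=fieldnames).writerow(rowtoadd)

# Demo on a temporary file that contains only a header row.
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w', newline='') as f:
    f.write('stuff1,stuff2,stuff3\r\n')

# Keyword arguments are deliberately passed out of column order.
row_add(path, stuff3='hello3', stuff1='hello1', stuff2='hello2')

with open(path, newline='') as f:
    rows = list(csv.reader(f))
os.remove(path)
print(rows)
```

With this design the caller's argument order no longer matters, and `DictWriter` raises `ValueError` if a keyword does not match any column in the header.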
