Is there a way to pass variable as counter to list index in python? - python-3.x

Sorry if this is a very basic question, but I am new to Python and need help with the problem below.
I am trying to write a file parser that counts the number of occurrences (modified programs) mentioned in a file.
I store each occurrence in an initially empty list and keep a counter of the occurrences.
Up to this point everything works.
Now I am trying to create files named after the entries captured in the list, and to store the non-matching lines that fall between occurrences in a separate file for each entry. However, I get an "index out of range" error: when I pass el[count], it seems to treat count as a string instead of using count's value.
Can someone help?
import sys
import re
count =1
j=0
k=0
el=[]
f = open("change_programs.txt", 'w+')
data = open("oct-released_diff.txt",encoding='utf-8',errors='ignore')
for i in data:
    if len(i.strip()) > 0 and i.strip().startswith("diff --git"):
        count = count + 1
        el.append(i)
        fl=[]
    else:
        # the "index out of range" error is raised on the next line
        filename = "%s.txt" % el[int (count)]
        h = open(filename, 'w+')
        fl.append(i)
        print(fl, file=h)
el = '\n'.join(el)
print(el, file=f)
print(filename)
data.close()
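One possible reading of the error: count is an integer, not a string, but it starts at 1 and is incremented before it is used, so el[count] always points past the last element of el (and el is still empty if the first line is not a "diff --git" header). Below is a minimal sketch that avoids the counter altogether by remembering the most recent header. The function name group_diff_lines and the sample lines are hypothetical and only there for illustration.

```python
def group_diff_lines(lines):
    """Group the lines of a diff under the most recent 'diff --git' header."""
    groups = {}      # header line -> list of lines that follow it
    current = None   # header being collected; None before the first header
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("diff --git"):
            current = stripped
            groups[current] = []
        elif current is not None:
            # lines seen before the first header are skipped
            groups[current].append(line)
    return groups


# Made-up sample input, only for illustration.
sample = [
    "diff --git a/prog1 b/prog1\n",
    "+added line\n",
    "diff --git a/prog2 b/prog2\n",
    "-removed line\n",
    "+new line\n",
]
groups = group_diff_lines(sample)
for header, body in groups.items():
    print(header, len(body))
```

To write one file per group, each header would still need to be turned into a valid file name first, since it contains spaces and "/" characters.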

Related

How to maintain the last index value in python list while doing iteration

My scenario: let's say I have a list,
list1 = ["a", "b", "c"], and this list is dynamic (data keeps getting appended).
My requirement is to send the list data to eventhub each day, but I should not upload all of the data every day. I only need to upload the delta.
My approach is:
index = 0
for i in range(len(list1)):
    ## upload
    index = index + 1
I want to preserve the latest index value. For example, in the first run index would be 2, and for the next run index should be 3, not 0 as in the code above. How should I proceed?
I'd simply create a local file that stores the index. The next time you need it, load it and continue from there, assuming you are using the array and the script locally and only have to upload part of that array.
import os
index = 0
if os.path.isfile("indexFile.txt"):
    with open("indexFile.txt", "r") as f:
        s = f.read()
    try:
        index = int(s)
    except ValueError:
        # fall back to 0 if the file does not contain a number
        index = 0
print("index is: " + str(index))
print("do something with that index")
index += 1
print("store the index back to file")
with open("indexFile.txt", "w") as f:
    f.write(str(index))
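To connect the stored index back to the "upload only the delta" requirement, here is a minimal sketch, assuming the list only ever grows by appending. The helper names load_index and upload_delta are hypothetical, and the upload itself is stood in for by a plain callback because the real eventhub call is not shown in the question.

```python
import os
import tempfile


def load_index(path):
    """Return the stored index, or 0 if the file is missing or invalid."""
    if not os.path.isfile(path):
        return 0
    with open(path, "r") as f:
        try:
            return int(f.read())
        except ValueError:
            return 0


def upload_delta(items, path, upload):
    """Upload only the items added since the last run, then store the new index."""
    start = load_index(path)
    delta = items[start:]          # everything not uploaded yet
    for item in delta:
        upload(item)               # placeholder for the real upload call
    with open(path, "w") as f:
        f.write(str(len(items)))   # the next run starts after the last item
    return delta


# Illustration with a temporary index file and a list as the "upload" target.
index_path = os.path.join(tempfile.mkdtemp(), "indexFile.txt")
uploaded = []
list1 = ["a", "b", "c"]
first_run = upload_delta(list1, index_path, uploaded.append)
list1.append("d")
second_run = upload_delta(list1, index_path, uploaded.append)
print(first_run, second_run)
```

Storing len(items) rather than the last loop index means the saved value is directly usable as the start of the next slice.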

Trying to compare two integers in Python

Okay, I have been digging through Stack Overflow and other sites trying to understand why this is not working. I created a function to open a csv file. The function opens the file once to count the number of rows, then again to actually process the file. What I am attempting to do is this: once a file has been processed and the record counts match, I will load the data into a database. The problem is that the record counts are not matching. I checked both variables and they are both 'int', so I do not understand why '==' is not working for me. Here is the function I created:
import csv
import fnmatch
def mktdata_import(filedir):
    '''
    This function is used to import market data
    '''
    files = []
    files = filedir.glob('*.csv')
    for f in files:
        if fnmatch.fnmatch(f,'*NASDAQ*'):
            num_rows = 0
            nasObj = []
            with open(f,mode='r') as nasData:
                nasIn = csv.DictReader(nasData, delimiter=',')
                recNum = sum(1 for _ in nasData)
            with open(f,mode='r') as nasData:
                nasIn = csv.DictReader(nasData, delimiter=',')
                for record in nasIn:
                    if (recNum - 1) != num_rows:
                        num_rows += 1
                        nasObj.append(record)
                    elif(recNum - 1) == num_rows:
                        print('Add records to database')
                    else:
                        print('All files have been processed')
            print('{} has this many records: {}'.format(f, num_rows))
            print(type(recNum))
            print(type(num_rows))
        else:
            print("Not a NASDAQ file!")
(moving comment to answer)
nasData includes all the rows in the file, including the header row. When the data is converted to dictionaries with DictReader, only the data rows are yielded, so the number of lines in nasData will always be one more than the number of records in nasIn.
As the OP mentioned, iterating over the elements did not work, so the reader's line number was needed to get the script working: (recNum) == nasIn.line_num
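The off-by-one can be seen in a small sketch, assuming a file with a single header row, no blank lines, and no fields that span several lines. The sample data is made up, and io.StringIO stands in for the real file.

```python
import csv
import io

# Made-up data: one header row followed by three data rows.
sample = "symbol,price\nAAA,1.0\nBBB,2.0\nCCC,3.0\n"

# Counting raw lines includes the header row.
total_lines = sum(1 for _ in io.StringIO(sample))

# DictReader consumes the header as field names and yields only data rows.
reader = csv.DictReader(io.StringIO(sample), delimiter=",")
records = list(reader)

print(total_lines, len(records), reader.line_num)
```

After the reader has been exhausted, line_num reports how many lines were read from the source, header included, which is why it can be compared directly with the raw line count.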

Read out .csv and hand results to a dictionary

I am learning some coding, and I am stuck with an error I can't explain. Basically I want to read out a .csv file with birth statistics from the US to figure out the most popular name in the time recorded.
My code looks like this:
# 0:Id, 1: Name, 2: Year, 3: Gender, 4: State, 5: Count
names = {} # initialise dict names
maximum = 0 # store for maximum
l = []
with open("Filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        try:
            name = l[1]
            if name in names:
                names[name] = int(names[name]) + int(l(5))
            else:
                names[name] = int(l(5))
        except:
            continue
print(names)
max(names)
def max(values):
    for i in values:
        if names[i] > maximum:
            names[i] = maximum
        else:
            continue
    return(maximum)
print(maximum)
It seems the dictionary does not receive any values at all, since the print command does not return anything. Where did I go wrong? (Incidentally, the file path is correct; it takes a while to get the result because the .csv is quite big.) My assumption is that I somehow made a mistake writing into the dictionary, but I have been staring at the code for a while now and I don't see it!
A few suggestions to improve your code:
names = {} # initialise dict names
with open("Filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        name = l[1]
        # int() raises ValueError on a header row; skip that row first if the file has one
        names[name] = names.get(name, 0) + int(l[5])
# build (count, name) pairs so that sorting orders by count
maximum = [(v, k) for k, v in names.items()]
maximum.sort(reverse=True)
print(maximum[0])
You will want to look into Python dictionaries and learn about get. It helps you build your names dictionary in fewer lines of code (more Pythonic).
Also, you used def to generate a function but you never called that function. That is why it's not printing.
I propose the shorter code above. Ask if you have questions!
Figured it out.
I think there were a few flow issues: I called a function before defining it... is that an issue or is python okay with that?
Also I think I used max as a name for a variable, but there is a built-in function with the same name, that might cause an issue I guess?! Same with value
This is my final code:
names = {} # initialise dict names
l = []
def maxval(val):
    maxname = max(val.items(), key=lambda x : x[1])
    return maxname
with open("filepath", "r") as file:
    for line in file:
        l = line.strip().split(",")
        name = l[1]
        try:
            names[name] = names.get(name, 0) + int(l[5])
        except:
            continue
        #print(str(l))
#print(names)
print(maxval(names))
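For comparison, here is a minimal sketch of the same idea using the csv module, assuming the column layout from the comment in the question (name in column 1, count in column 5). The function name most_popular_name and the sample rows are made up for illustration. The function is defined before it is called, and no built-in name such as max is reused for a variable or function.

```python
import csv
import io


def most_popular_name(lines):
    """Sum the counts per name and return the (name, total) pair with the largest total."""
    totals = {}
    for row in csv.reader(lines):
        try:
            name, count = row[1], int(row[5])
        except (IndexError, ValueError):
            # skips the header row and any short or malformed row
            continue
        totals[name] = totals.get(name, 0) + count
    # max() raises ValueError if no row could be parsed
    return max(totals.items(), key=lambda item: item[1])


# Made-up sample data in the assumed column order.
sample = io.StringIO(
    "Id,Name,Year,Gender,State,Count\n"
    "1,Mary,1910,F,AK,14\n"
    "2,Annie,1910,F,AK,12\n"
    "3,Mary,1911,F,AK,9\n"
)
best = most_popular_name(sample)
print(best)
```

Catching only IndexError and ValueError, instead of using a bare except, keeps mistakes such as l(5) visible as a TypeError instead of silently skipping every row.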

Split the file and further split specific index

I have a big configuration file that needs to be split per hierarchy based on a specific syntax; that part is done. On a specific string match I then pull out the matching entries, and each of those needs to be split further on the regex match "\s{8}exit", which is not working.
import re
def config_readFile(filename):
    F=open(filename,"r")
    s=F.read()
    F.close()
    return re.split(r"#\-{50}",s)
C = config_readFile('bng_config_input.txt')
print ('printing C:', C)
print (C.__len__())
vAAA_FIXED = []
for i in C:
    if "CRS-1-SUB1" in i:
        vAAA_FIXED.append(i)
        # print (vAAA_FIXED)
print (vAAA_FIXED.__len__())
print(vAAA_FIXED)
# this is the step that fails: vAAA_FIXED is a list, and lists have no split()
vAAA_FIXED = vAAA_FIXED.split(" ")
print (vAAA_FIXED.__len__())
I need to get a new list from the original list.
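A minimal sketch of one way to get that new list, assuming the goal is to split every matching section on the "\s{8}exit" pattern: a list has no split() method, so the regex split is applied to each string in the list. The function name split_matching_sections and the sample text are hypothetical.

```python
import re


def split_matching_sections(text, marker,
                            section_sep=r"#-{50}", block_end=r"\s{8}exit"):
    """Split text into sections, then split each section containing marker into blocks."""
    blocks = []
    for section in re.split(section_sep, text):
        if marker in section:
            # re.split works on a single string, so it is applied per section
            pieces = re.split(block_end, section)
            blocks.extend(p.strip() for p in pieces if p.strip())
    return blocks


# Made-up sample: sections separated by '#' followed by 50 dashes.
separator = "#" + "-" * 50 + "\n"
sample = (
    "header\n"
    + separator
    + "CRS-1-SUB1 first block\n        exit\nsecond block\n        exit\n"
    + separator
    + "OTHER block\n        exit\n"
)
blocks = split_matching_sections(sample, "CRS-1-SUB1")
print(blocks)
```

Note that "\s{8}exit" only matches when "exit" is preceded by at least eight whitespace characters, so an "exit" at a different indentation level is left untouched.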

Delta words between two TXT files

I would like to count the delta words between two files.
file_1.txt has the content: One file with some text and words.
file_2.txt has the content: One file with some text and additional words to be found.
The diff command on Unix systems gives the following information. difflib can give a similar output.
$ diff file_1.txt file_2.txt
1c1
< One file with some text and words.
---
> One file with some text and additional words to be found.
Is there an easy way to find the words added or removed between two files, or at least between two lines, as git diff --word-diff does?
First of all, you need to read your files into strings with open(), where 'file_1.txt' is the path to your file and 'r' stands for "reading mode".
Do the same for the second file. And don't forget to close() your files when you're done!
Use the split(' ') method to split the strings you have just read into lists of words.
file_1 = open('file_1.txt', 'r')
text_1 = file_1.read().split(' ')
file_1.close()
file_2 = open('file_2.txt', 'r')
text_2 = file_2.read().split(' ')
file_2.close()
Next, you need to get the difference between the text_1 and text_2 lists.
There are many ways to do it.
1)
You can use the Counter class from the collections module.
Pass your lists to the class's constructor, then find the difference by subtracting in both directions, call the elements() method to get the elements, and use list() to convert the result to a list.
from collections import Counter
text_count_1 = Counter(text_1)
text_count_2 = Counter(text_2)
difference = list((text_count_1 - text_count_2).elements()) + list((text_count_2 - text_count_1).elements())
Here is the way to calculate the delta words.
from collections import Counter
text_count_1 = Counter(text_1)
text_count_2 = Counter(text_2)
delta = len(list((text_count_2 - text_count_1).elements())) \
- len(list((text_count_1 - text_count_2).elements()))
print(delta)
2)
Use the Differ class from the difflib module. Pass both lists to the compare() method of Differ and then iterate over the result with for.
from difflib import Differ
difference = []
for d in Differ().compare(text_1, text_2):
    difference.append(d)
Then you can count the delta words like this.
from difflib import Differ
delta = 0
for d in Differ().compare(text_1, text_2):
    # the first character of each entry is "+", "-", "?" or a space
    status = d[0]
    if status == "+":
        delta += 1
    elif status == "-":
        delta -= 1
print(delta)
3)
You can write the difference function yourself. For example:
def get_diff (list_1, list_2):
    d = []
    for item in list_1:
        if item not in list_2:
            d.append(item)
    return d
difference = get_diff(text_1, text_2) + get_diff(text_2, text_1)
I think there are other ways to do this, but I will limit myself to these three.
Once you have the difference list, you can format the output however you wish.
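Applied to the two example lines from the question, the Counter approach can be sketched as follows. One assumption that differs from the split(' ') calls above: words are extracted with a regular expression so that trailing punctuation does not make "words." and "words" count as different words. The function name word_delta is hypothetical.

```python
import re
from collections import Counter


def word_delta(old_text, new_text):
    """Return (added, removed) word lists between two strings."""
    # \w+ keeps letters, digits and underscores, dropping punctuation
    old_counts = Counter(re.findall(r"\w+", old_text))
    new_counts = Counter(re.findall(r"\w+", new_text))
    added = sorted((new_counts - old_counts).elements())
    removed = sorted((old_counts - new_counts).elements())
    return added, removed


added, removed = word_delta(
    "One file with some text and words.",
    "One file with some text and additional words to be found.",
)
print(added, removed)
```

Counter subtraction keeps only positive counts, so a word that appears twice in one text and once in the other still shows up once in the result.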
...and here is yet another way to do this, with dict():
#!/usr/bin/env python3
import sys
def loadfile(filename):
    # map every word in the file to 1, so the dict keys act as a set of words
    h=dict()
    with open(filename) as f:
        for line in f:
            words=line.split(' ')
            for word in words:
                h[word.strip()]=1
    return h
first=loadfile(sys.argv[1])
second=loadfile(sys.argv[2])
print("in both first and second")
for k in first.keys():
    if k and k in second.keys():
        print(k)

Resources