How to swap string and save - python-3.x

I am having trouble saving the file I modified. Basically, I need to replace the string switch_data in the original file with DTC_5814_removing and save the result as a separate file. How would I do that? Here is what the program does: it opens the eeprom file, searches for a string between two marker strings and captures it, measures its length, and uses that length to locate another string between two markers and modify its data. The code basically works; my question is what the best way is to save the result as a separate file. The filesave handle currently does nothing.
Here is the code:
import re
# checking the structures and counting
file = open("eeprom", "rb").read().hex()
filesave = open("eepromMOD", "wb")
# finds the string between two strings in the 2nd line of the eeprom file
DTC_data = re.search("ffff30(.*)100077", file)
print(DTC_data.group(1))
# finds the string between two strings in the 3rd line of the eeprom file
switch_data = re.search("010607(.*)313132", file)
print(switch_data.group(1))
# length of the whole DTC_data group
DTC_data_lenght = len(DTC_data.group(1))
# searching for DTCs 312D, 3036 and 5814
DTC_312D = re.search("ffff30(.*)312d", file)
DTC_3036 = re.search("ffff30(.*)3036", file)
DTC_5814 = re.search("ffff30(.*)5814", file)
# start and end offsets of each DTC inside the table
DTC_312D_lenght = len(DTC_312D.group(1)) + 4
DTC_312D_lenght_start = len(DTC_312D.group(1))
DTC_3036_lenght = len(DTC_3036.group(1)) + 4
DTC_3036_lenght_start = len(DTC_3036.group(1))
DTC_5814_lenght = len(DTC_5814.group(1)) + 4
DTC_5814_lenght_start = len(DTC_5814.group(1))
# confirming the length of the DTC table
if DTC_312D_lenght <= DTC_data_lenght and DTC_312D_lenght % 4 == 0:
    # DTC offset is inside the whole table and divisible by 4
    print("Starting DTC removal")
    # counting the switch data table
    switch_data_lenght = len(switch_data.group(1))
    # data[:start] + "replacement value" + data[end:]
    DTC_312D_removing = switch_data.group(1)[:DTC_312D_lenght_start] + "0000" + switch_data.group(1)[DTC_312D_lenght:]
    print(DTC_312D_removing)
else:
    print("DTC nonexistent or incorrect")
if DTC_3036_lenght <= DTC_data_lenght and DTC_3036_lenght % 4 == 0:
    # DTC offset is inside the whole table and divisible by 4
    print("Starting DTC removal")
    switch_data_lenght = len(switch_data.group(1))
    # take both halves from the previous result so the earlier patch is kept
    DTC_3036_removing = DTC_312D_removing[:DTC_3036_lenght_start] + "0000" + DTC_312D_removing[DTC_3036_lenght:]
    print(DTC_3036_removing)
else:
    print("DTC nonexistent or incorrect")
if DTC_5814_lenght <= DTC_data_lenght and DTC_5814_lenght % 4 == 0:
    # DTC offset is inside the whole table and divisible by 4
    print("Starting DTC removal")
    switch_data_lenght = len(switch_data.group(1))
    # take both halves from the previous result so the earlier patches are kept
    DTC_5814_removing = DTC_3036_removing[:DTC_5814_lenght_start] + "0000" + DTC_3036_removing[DTC_5814_lenght:]
    print(DTC_5814_removing)
else:
    print("DTC nonexistent or incorrect")

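Solved with:
import binascii
# swap the original switch data for the patched version in the hex string
File_W = file.replace(switch_data.group(1), DTC_5814_removing)
# convert the hex string back to raw bytes and write them out
File_WH = binascii.unhexlify(File_W)
filesave.write(File_WH)
filesave.close()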
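The same round trip (bytes to hex text, replace, hex text back to bytes) can be wrapped in one helper. This is only a minimal sketch: the function name patch_hex_file and its parameters are made up for illustration and are not part of the original program.

```python
import binascii


def patch_hex_file(src_path, dst_path, old_hex, new_hex):
    """Replace old_hex with new_hex in the hex view of src_path and
    write the result to dst_path as raw bytes (hypothetical helper)."""
    with open(src_path, "rb") as src:
        hex_text = src.read().hex()
    patched = hex_text.replace(old_hex, new_hex)
    # unhexlify needs an even number of hex digits, so old_hex and
    # new_hex should have the same parity in length
    with open(dst_path, "wb") as dst:
        dst.write(binascii.unhexlify(patched))
    # report whether anything was actually changed
    return patched != hex_text
```

Using with blocks closes both files even if the conversion fails. One caveat: str.replace works on hex digits, not bytes, so a pattern can in principle match across a byte boundary; for strict byte-level replacement, bytes.replace on the raw data avoids that.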

Related

How to handle blank lines, junk lines and \n while converting an input file to a CSV file

Below is the sample data in the input file. I need to process this file and turn it into a CSV file. With some help I was able to convert most of it, but not all, because I cannot handle the \n, the junk line (2nd line) and the blank line (4th line). I also need help filtering on transaction_type, i.e. skipping records whose transaction_type is "rewrite".
{"transaction_type": "new", "policynum": 4994949}
44uu094u4
{"transaction_type": "renewal", "policynum": 3848848,"reason": "Impressed with \n the Service"}
{"transaction_type": "cancel", "policynum": 49494949, "cancel_table":[{"cancel_cd": "AU"}, {"cancel_cd": "AA"}]}
{"transaction_type": "rewrite", "policynum": 5634549}
Below is the code:
import ast
import csv
with open('test_policy', 'r') as in_f, open('test_policy.csv', 'w') as out_f:
    data = in_f.readlines()
    writer = csv.DictWriter(
        out_f,
        fieldnames=['transaction_type', 'policynum', 'cancel_cd', 'reason'],
        lineterminator='\n',
        extrasaction='ignore')
    writer.writeheader()
    for row in data:
        dict_row = ast.literal_eval(row)
        if 'cancel_table' in dict_row:
            cancel_table = dict_row['cancel_table']
            cancel_cd = []
            for cancel_row in cancel_table:
                cancel_cd.append(cancel_row['cancel_cd'])
            dict_row['cancel_cd'] = ','.join(cancel_cd)
        writer.writerow(dict_row)
Below is my output when the junk line, the blank line and the "rewrite" transaction type are left out of the input:
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with
the Service"
cancel,49494949,"AU,AA",
Expected output
transaction_type,policynum,cancel_cd,reason
new,4994949,,
renewal,3848848,,"Impressed with the Service"
cancel,49494949,"AU,AA",
Hmm, I tried to fix this, but I do not know how CSV files work. With my limited knowledge, I would suggest running this code on each record before converting the file.
txt = {"transaction_type": "renewal",
       "policynum": 3848848,
       "reason": "Impressed with \n the Service"}
newTxt = {}
for i, j in txt.items():
    # local variables (temporary)
    lastX = ""
    correctJ = ""
    # check whether j contains the whitespace character "\n" and remove it
    if "\n" in f"b'{j}'":
        j = j.replace("\n", "")
    # for grammar purposes, check whether
    # j has at least one space
    if " " in str(j):
        # if yes, check it more closely (character by character)
        for x in ([j[y:y+1] for y in range(0, len(j), 1)]):
            # if two spaces are consecutive, skip the second one
            if x == " " and lastX == " ":
                pass
            # if not, append the character to correctJ
            else:
                correctJ += x
            # remember the last character checked
            lastX = x
        # at the end, replace j with the corrected value
        j = correctJ
    # add the corrected value to a new dictionary
    newTxt[i] = j
# show the result
print(f"txt = {txt}\nnewTxt = {newTxt}")
Terminal output:
txt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with \n the Service'}
newTxt = {'transaction_type': 'renewal', 'policynum': 3848848, 'reason': 'Impressed with the Service'}
Process finished with exit code 0
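Putting the pieces together, the whole conversion could look roughly like the sketch below. It is not the asker's code: it swaps ast.literal_eval for json.loads (assuming every valid line is JSON, as in the sample), and the helper names rows_from_lines and to_csv are invented for this example.

```python
import csv
import io
import json

FIELDS = ["transaction_type", "policynum", "cancel_cd", "reason"]


def rows_from_lines(lines, skip_types=("rewrite",)):
    """Yield cleaned records, skipping blank lines, junk lines and
    unwanted transaction types (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank line
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # junk line that is not valid JSON
        if not isinstance(record, dict):
            continue  # valid JSON, but not a record
        if record.get("transaction_type") in skip_types:
            continue
        if "cancel_table" in record:
            codes = [row["cancel_cd"] for row in record["cancel_table"]]
            record["cancel_cd"] = ",".join(codes)
        for key, value in record.items():
            if isinstance(value, str):
                # collapse newlines and runs of spaces into one space
                record[key] = " ".join(value.split())
        yield record


def to_csv(lines):
    """Return the CSV text for an iterable of input lines."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS,
                            lineterminator="\n", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows_from_lines(lines))
    return out.getvalue()
```

With a real file, to_csv(open('test_policy')) would return the text to write to test_policy.csv. Note that the csv module only quotes a field when it has to, so the cleaned reason is written without quotes.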

Python3 decode removes white spaces when should be kept

I'm reading a binary file that contains firmware for an STM32. I deliberately placed two const strings in the code so that I can read the SW version and description from a given file.
When you open the binary file with a hex editor, or even in python3, you can see the correct form. But when I run text = data.decode('utf-8', errors='ignore'), it removes the zeros from the file! I don't want this, as I rely on those terminating characters to properly split and extract the strings that interest me.
(preview of the end of the data variable)
Svc\x00..\Src\adc.c\x00..\Src\can.c\x00defaultTask\x00Task_CANbus_receive\x00Task_LED_Controller\x00Task_LED1_TX\x00Task_LED2_RX\x00Task_PWM_Controller\x00**SW_VER:GN_1.01\x00\x00\x00\x00\x00\x00MODULE_DESC:generic_module\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00**Task_SuperVisor_Controller\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x02\x03\x04\x06\x07\x08\t\x00\x00\x00\x00\x01\x02\x03\x04..\Src\tim.c\x005!\x00\x08\x11!\x00\x08\x01\x00\x00\x00\xaa\xaa\xaa\xaa\x01\x01\nd\x00\x02\x04\nd\x00\x00\x00\x00\xa2J\x04'
(preview of text, i.e. what I receive after decode)
r g # IDLE TmrQ Tmr Svc ..\Src\adc.c ..\Src\can.c
defaultTask Task_CANbus_receive Task_LED_Controller Task_LED1_TX
Task_LED2_RX Task_PWM_Controller SW_VER:GN_1.01
MODULE_DESC:generic_module
Task_SuperVisor_Controller ..\Src\tim.c 5! !
d d J
with open(path_to_file, "rb") as binary_file:
    # Read the whole file at once
    data = binary_file.read()
text = data.decode('utf-8', errors='ignore')
# get the index of the "SW_VER:" string in the file
sw_ver_index = text.rfind("SW_VER:")
# SW_VER found (compare with !=, since "is not" tests identity, not value)
if sw_ver_index != -1:
    # retrieve the value, e.g. for "SW_VER:WB_2.01" it starts at position 7 and ends at 14
    sw_ver_value = text[sw_ver_index + 7:sw_ver_index + 14]
    module.append(tuple(('DESC:', sw_ver_value)))
else:
    # SW_VER not found
    module.append(tuple(('DESC:', 'N/A')))
# get the index of the "MODULE_DESC:" string in the file
module_desc_index = text.rfind("MODULE_DESC:")
# MODULE_DESC found
if module_desc_index != -1:
    module_desc_substring = text[module_desc_index + 12:]
    module_desc_value = module_desc_substring.split()
    module.append(tuple(('DESC:', module_desc_value[0])))
    print(module_desc_value[0])
As you can see, my whitespace characters are gone, while they should be present.
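One thing worth checking here: as far as I know, decode with errors='ignore' only drops bytes that are not valid UTF-8 (such as \xaa); \x00 is a valid character and survives decoding, it is just invisible when printed, and str.split() does not treat it as whitespace. A way to sidestep the question entirely is to search the raw bytes and cut at the terminating NUL. The sketch below assumes the strings are NUL-terminated ASCII, as in the preview; the function name extract_field is made up for illustration.

```python
def extract_field(data: bytes, tag: bytes) -> str:
    """Return the NUL-terminated string that follows the last
    occurrence of tag in data, or 'N/A' if tag is absent
    (hypothetical helper)."""
    start = data.rfind(tag)
    if start == -1:
        return "N/A"
    start += len(tag)
    # the value ends at the first NUL byte after the tag
    end = data.find(b"\x00", start)
    if end == -1:
        end = len(data)
    return data[start:end].decode("ascii", errors="replace")
```

For example, extract_field(data, b"SW_VER:") and extract_field(data, b"MODULE_DESC:") could replace the two rfind blocks, with no fixed-width slicing needed.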

Unknown column added in user input form

I have a simple data entry form that writes the inputs to a csv file. Everything seems to be working ok, except that there are extra columns being added to the file in the process somewhere, seems to be during the user input phase. Here is the code:
import pandas as pd
# adds all spreadsheets into one list
Batteries = ["MAT0001.csv", "MAT0002.csv", "MAT0003.csv", "MAT0004.csv",
             "MAT0005.csv", "MAT0006.csv", "MAT0007.csv", "MAT0008.csv"]
# User selects battery to log
choice = int(input("Which battery? (1-8):"))
def choosebattery(c):
    # ask again until the selection is valid
    while c not in range(1, 9):
        print('Sorry, selection must be between 1-8')
        c = int(input("Which battery? (1-8):"))
    # list indices start at 0, so battery 1 is Batteries[0]
    return Batteries[c - 1]
cfile = choosebattery(choice)
cbat = pd.read_csv(cfile)
# Collect Cycle input
print("Enter Current Cycle")
response = None
while response not in {"Y", "N", "y", "n"}:
    response = input("Please enter Y or N: ")
cy = response
# Charger input
print("Enter Current Charger")
response = None
while response not in {"SC-G", "QS", "Bosca", "off", "other"}:
    response = input("Please enter one: 'SC-G', 'QS', 'Bosca', 'off', 'other'")
if response == "other":
    explain = input("Please explain")
    ch = response + ":" + explain
else:
    ch = response
# Location
print("Enter Current Location")
response = None
while response not in {"Rack 1", "Rack 2", "Rack 3", "Rack 4", "EV001", "EV002", "EV003", "EV004", "Floor", "other"}:
    response = input("Please enter one: 'Rack 1 - 4', 'EV001 - 004', 'Floor' or 'other'")
if response == "other":
    explain = input("Please explain")
    lo = response + ":" + explain
else:
    lo = response
# Voltage
done = False
while not done:
    choice = float(input("Enter Current Voltage:"))
    # compare the float directly; range() only contains whole numbers
    if 50 <= choice <= 70:
        vo = choice
        done = True
    else:
        print('Sorry, selection must be between 50 and 70')
# add inputs to current battery dataframe
log = pd.DataFrame([[cy, ch, lo, vo]], columns=["Cycle", "Charger", "Location", "Voltage"])
clog = pd.concat([cbat, log], axis=0)
clog.to_csv(cfile, index=False)
pd.read_csv(cfile)
And I receive:
Out[18]:
Charger Cycle Location Unnamed: 0 Voltage
0 off n Floor NaN 50.0
Where is the "Unnamed" column coming from?
There's an 'unnamed' column coming from your csv. The most likely reason is that the lines in your input csv files end with a comma (i.e. your separator), so pandas interprets that as an additional (nameless) column. Check whether your lines end with your separator and, if they do, remove it. For example, if your files are separated by commas, change:
Column1,Column2,Column3,
val_11, val12, val12,
...
into:
Column1,Column2,Column3
val_11, val12, val12
...
Alternatively, try specifying the index column explicitly as in this answer. I believe some of the confusion stems from pandas concat reordering your columns.
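If the files do turn out to have trailing separators, they can be cleaned once before pandas reads them. The snippet below is only a sketch using plain string handling; the function name strip_trailing_separator is invented here, and it assumes that a separator at the end of a line is never a legitimately empty last field.

```python
def strip_trailing_separator(text, sep=","):
    """Remove one trailing separator from every line that ends
    with it (hypothetical helper)."""
    cleaned = []
    for line in text.splitlines():
        stripped = line.rstrip()
        if stripped.endswith(sep):
            # drop exactly one separator, keep the rest of the line
            line = stripped[:-len(sep)]
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"
```

The cleaned text can be written back to the file, or wrapped in io.StringIO and passed straight to pd.read_csv.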

Rearranging data from duplicates in Excel

I have an Excel sheet with a list of mothers and students.
Every record shows just one student, but some of the moms have more than one student, meaning they have multiple records.
How do I move the extra students for every mom to additional columns on that mom's row?
example:
to this:
Here is a short ruby script that does that (thanks to Yanik Rabe)
(The pluralize method in the beginning is not really necessary, just makes prettier error messages...)
#!/usr/bin/env ruby
# The CSV file to read from
INPUT_FILE = 'sample.csv'
# The CSV file to write to
# This can be the same as INPUT_FILE
# Existing files will be overwritten
OUTPT_FILE = 'sample.out.csv'
# The total number of cells in each row
CELL_COUNT = 10
# The number of cells for each student
# This defines how many columns contain
# student data and will therefore be
# moved to the matching row.
STDT_CELLS = 4
# ----- End of configuration -----
require 'csv'
lines = CSV.read INPUT_FILE
class String
  def pluralize(n, plural = nil)
    noun = \
      if n == 1
        dup
      elsif plural
        plural
      else
        "#{self}s"
      end
    "#{n} #{noun}"
  end
end
class Array
  def real_length
    last_index = 0
    (0..length).each do |i|
      last_index = i if self[i] and not self[i].to_s.empty?
    end
    last_index + 1
  end
end
lines.each_with_index do |line, i|
  next unless line
  if line.length < CELL_COUNT
    missing = CELL_COUNT - line.length
    noun = 'missing cell'.pluralize missing
    STDERR.puts "Warning: #{noun} on line #{i+1} will be filled with an empty string in the output"
    while line.length < CELL_COUNT
      line << ''
    end
  end
  # Check for other entries with the same parent email
  lines.each_with_index do |dline, di|
    next unless dline
    next if i == di
    next if dline.empty?
    if line.first == dline.first
      ((CELL_COUNT - STDT_CELLS)..(CELL_COUNT - 1)).each do |i|
        line[line.real_length] = dline[i]
      end
      lines[di] = nil
    end
  end
end
CSV.open OUTPT_FILE, 'wb' do |csv|
  lines.each do |line|
    csv << line if line
  end
end
puts "Read #{INPUT_FILE} (#{lines.length} lines), wrote #{OUTPT_FILE} (#{lines.compact.length} lines). Have a nice day!"
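For comparison, the core idea of the script (merge rows that share the same first cell and append the student cells of each duplicate to the first row) can be sketched in a few lines of Python. This is a simplified illustration, not a port: it assumes the first column identifies the mother and that the student data sits in the last student_cells columns, and it skips the padding and warning logic. The name merge_duplicates is made up.

```python
def merge_duplicates(rows, student_cells=4):
    """Merge rows that share the same first cell by appending the
    student cells of later rows to the first one (hypothetical helper)."""
    merged = {}
    for row in rows:
        if not row:
            continue  # skip empty rows
        key = row[0]
        if key in merged:
            # duplicate parent: keep only the student columns
            merged[key].extend(row[-student_cells:])
        else:
            merged[key] = list(row)
    # dicts preserve insertion order, so the first-seen order is kept
    return list(merged.values())
```

The rows could come from csv.reader and the result could be written back with csv.writer.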

How to convert cmudict-0.7b or cmudict-0.7b.dict into FST format to use it with phonetisaurus?

I am looking for a simple procedure to generate an FST (finite state transducer) from cmudict-0.7b or cmudict-0.7b.dict, which will be used with phonetisaurus.
I tried the following set of commands (phonetisaurus aligner, Google NGram library and phonetisaurus arpa2wfst) and was able to generate an FST, but it didn't work. I am not sure where I made a mistake or missed a step. I suspect the very first command, i.e. phonetisaurus-align, is not correct.
phonetisaurus-align --input=cmudict.dict --ofile=cmudict/cmudict.corpus --seq1_del=false
ngramsymbols < cmudict/cmudict.corpus > cmudict/cmudict.syms
/usr/local/bin/farcompilestrings --symbols=cmudict/cmudict.syms --keep_symbols=1 cmudict/cmudict.corpus > cmudict/cmudict.far
ngramcount --order=8 cmudict/cmudict.far > cmudict/cmudict.cnts
ngrammake --v=2 --bins=3 --method=kneser_ney cmudict/cmudict.cnts > cmudict/cmudict.mod
ngramprint --ARPA cmudict/cmudict.mod > cmudict/cmudict.arpa
phonetisaurus-arpa2wfst-omega --lm=cmudict/cmudict.arpa > cmudict/cmudict.fst
I tried fst with phonetisaurus-g2p as follows:
phonetisaurus-g2p --model=cmudict/cmudict.fst --nbest=3 --input=HELLO --words
But it didn't return anything....
Appreciate any help on this matter.
It is very important to keep the dictionary in the right format. Phonetisaurus is very sensitive about that: it requires the word and the phonemes to be separated by a tab, and spaces will not work. It also does not allow the pronunciation variant numbers that CMUSphinx uses, like (2) or (3). You need to clean up the dictionary, for example with a simple Python script, before feeding it into phonetisaurus. Here is the one I use:
#!/usr/bin/env python3
import sys
if len(sys.argv) != 3:
    print("Split the list on train and test sets")
    print()
    print("Usage: traintest.py file split_count")
    sys.exit()
infile = open(sys.argv[1], "r")
outtrain = open(sys.argv[1] + ".train", "w")
outtest = open(sys.argv[1] + ".test", "w")
cnt = 0
split_count = int(sys.argv[2])
for line in infile:
    items = line.split()
    # skip empty lines
    if not items:
        continue
    # strip a pronunciation variant marker such as "(2)"
    if items[0][-1] == ')':
        items[0] = items[0][:-3]
    # skip words that contain an underscore
    if items[0].find("_") > 0:
        continue
    # word and phonemes must be separated by a tab
    line = items[0] + '\t' + " ".join(items[1:]) + '\n'
    if cnt % split_count == 3:
        outtest.write(line)
    else:
        outtrain.write(line)
    cnt = cnt + 1
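If only the cleanup is needed, without the train/test split, the per-line transformation can be isolated as below. This is a sketch, not part of the script above: the name clean_entry is invented, it skips any word containing an underscore (slightly stricter than the find("_") > 0 check), and it uses a regular expression so that multi-digit variant markers are handled too.

```python
import re

# matches a trailing pronunciation variant marker such as "(2)"
VARIANT = re.compile(r"\(\d+\)$")


def clean_entry(line):
    """Return 'WORD<TAB>PH1 PH2 ...' for a dictionary line, or None
    if the line should be skipped (hypothetical helper)."""
    items = line.split()
    if len(items) < 2:
        return None  # empty line or word without phonemes
    word = VARIANT.sub("", items[0])
    if "_" in word:
        return None
    # tab between word and phonemes, single spaces between phonemes
    return word + "\t" + " ".join(items[1:])
```

Each non-None result is one line of the cleaned dictionary that could then be passed to phonetisaurus-align.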
