The code works incorrectly and displays incomprehensible numbers: - python-3.x

The code works incorrectly and displays incomprehensible values such as:
3.612432554629948e+76
Instead, it should output the full decimal value for a large number, for example:
HEX: 8459f630cd86ddfa329b3d13d5217d45df1d5e9a56a63f6a3d7ab8b794c35c12
DEC: 59864244547079690871082685810675850360550404961977540588162601013229404773394
# Convert a file with lines of hex values in "hex.txt" to decimal
# values and write to "dec.txt"
# NOTE: This program only handles ValueErrors and replaces them
# with the string XXXXX, no other error handling is performed

HEXLIST_FILENAME = "hex.txt"
DECLIST_FILENAME = "dec.txt"

def loadHex():
    """
    Returns a list of hex strings from a file
    """
    hexList = []
    print("Loading hex list from file...")
    try:
        inFile = open(HEXLIST_FILENAME, 'r')
    except IOError:
        print('No such file "hex.txt"')
        # more error handling here
    for line in inFile:
        hexList.append(line.strip().upper())
    print(len(hexList), "Numbers loaded.")
    return hexList

def hexToDec(hexString):
    """
    Takes in a string representing a hex value
    Returns a decimal number string
    """
    try:
        i = int(hexString, 16)
    except ValueError:
        print('Oops! There was an invalid number in hex.txt...')
        print('Invalid number replaced with XXXXX')
        i = 'XXXXX'
    return str(i/float(2**21))

def exportDec(decList):
    """
    Writes a list of decimal number strings to "dec.txt"
    """
    outFile = open(DECLIST_FILENAME, 'w')
    for num in decList:
        outFile.write(num + "\n")
    outFile.close()
    print("Success! Decimal numbers written to dec.txt")

decList = []
hexVals = loadHex()
for hexnum in hexVals:
    decList.append(hexToDec(hexnum))
exportDec(decList)

s = input('Press Enter to exit...')

Not sure why you are dividing the converted value by 2**21 in your hexToDec method:
return str(i/float(2**21))
The division is unnecessary and you would get the correct output if you just
return str(i)
directly from your hexToDec method.
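For reference, a minimal sketch of the corrected function with only that one change applied (everything else kept from the question):

def hexToDec(hexString):
    """
    Takes in a string representing a hex value
    Returns a decimal number string
    """
    try:
        i = int(hexString, 16)   # parse the string as a base-16 integer
    except ValueError:
        print('Oops! There was an invalid number in hex.txt...')
        print('Invalid number replaced with XXXXX')
        i = 'XXXXX'
    return str(i)                # no division, so the full integer is preserved

# Example with the value from the question:
# hexToDec("8459f630cd86ddfa329b3d13d5217d45df1d5e9a56a63f6a3d7ab8b794c35c12")
# -> '59864244547079690871082685810675850360550404961977540588162601013229404773394'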


Remove Carriage Return from the final print statement

import re
import sys

def isValid(s):
    pattern_ = re.compile("[12][\d]{12}$")
    return pattern_.match(s)

loop = int(input())
output = []
for _ in range(0, loop):
    ele = int(input())
    output.append(ele)

entries = ''
for x in output:
    entries += str(x) + ''

print(output)          # ['0123456789012']
print(entries)         # 0123456789012
print(type(entries))   # str
print(type(output))    # list

# Driver Code
for _ in range(loop):
    for x in entries:
        if (isValid(x)):
            sys.stdout.write("Valid Number")
            break
        else:
            sys.stdout.write("Invalid Number")
            break
Phone numbers start with the digit 1 or 2, followed by exactly 12 digits, i.e. a phone number comprises 13 digits in total.
For each phone number, print "Valid" or "Invalid" on a new line.
The list is taking the wrong input.
The output generated is:
2
0123456789012
1123456789012
[123456789012, 1123456789012]
123456789012 1123456789012
<class 'str'>
<class 'list'>
Invalid NumberInvalid Number
[Program finished]
Also, I searched on Stack Overflow before posting; this looked like a different issue. If anything on here matches this error, please redirect me there.
2
1123456789012
0123456778901
Valid Number
Invalid Number
[Program finished]
This is what it should look like
import re

def isValid(s):
    pattern_ = re.compile(r'[1|2][0-9]{12}$')
    return pattern_.match(s)

loop = int(input())   # no of times the loop runs
output = []
for _ in range(0, loop):
    output.append(input())

entries = ''
for x in output:
    entries += x + ''

result = []
# Driver Code
for val in output:
    if isValid(val):
        result.append('Valid Number')
    else:
        result.append('Invalid Number')

for i in range(len(result) - 1):
    print(result[i])
print(result[-1], end=" ")
This should work too.
print first converts the object to a string (if it is not already one), separates multiple arguments with a space, and appends a newline character at the end by default.
When using sys.stdout.write, you need to convert the object to a string yourself (by calling str, for example) and no newline character is added.
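To illustrate the difference, a small sketch (Python 3 assumed):

import sys

print("Valid Number")             # print() appends '\n' by default
print("Valid Number", end="")     # end="" suppresses the trailing newline
sys.stdout.write("Valid Number")  # write() adds nothing; add "\n" yourself if needed
sys.stdout.write("\n")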
May I also suggest rephrasing your question, as it's not a logic issue but a syntax issue.
Comment:
Checked with single and multiple inputs.
Works.
Try using the below regex
def is_valid(s):
    pattern_ = re.compile(r'[1|2][0-9]{12}$')
    return pattern_.match(s)
I am not sure why you are appending the numbers to the entries variable. I have changed the code a bit and the regex is working fine.
import re

def is_valid(s):
    pattern_ = re.compile(r'[1|2][0-9]{12}$')
    return pattern_.match(s)

loop = int(input())
output = []
for _ in range(0, loop):
    output.append(input())

entries = ''
for x in output:
    entries += x + ''

print(output)          # ['0123456789012']
print(entries)         # 0123456789012
print(type(entries))   # str
print(type(output))    # list

# Driver Code
for val in output:
    if is_valid(val):
        print('Valid Number')
    else:
        print('Invalid Number')
Input:
5
1234567891234
1893456879354
2897347838389
0253478642678
6249842352985
Output:
['1234567891234', '1893456879354', '2897347838389', '0253478642678', '6249842352985']
12345678912341893456879354289734783838902534786426786249842352985
<class 'str'>
<class 'list'>
Valid Number
Valid Number
Valid Number
Invalid Number
Invalid Number
import sys
import re

def isValid(s):
    pattern_ = re.compile(r'[1|2][0-9]{12}$')
    return pattern_.match(s)

loop = int(input())
output = []
for _ in range(0, loop):
    output.append(input())

entries = ''
for x in output:
    entries += x + ''

print(output)          # ['0123456789012']
print(entries)         # 0123456789012
print(type(entries))   # str
print(type(output))    # list

# Driver Code
for val in output:
    if isValid(val):
        sys.stdout.write('Valid Number')
    else:
        sys.stdout.write('Invalid Number')
produces
1
1234567891234
['1234567891234']
1234567891234
<class 'str'>
<class 'list'>
Valid Number
[Program finished]
print always appends a newline (a carriage return), whereas sys.stdout.write doesn't.
That is how the challenge was resolved.

Convert all strings in a list to float. Works on single list but not when applied to dataframe

I've got a DataFrame df_tweets with geolocation data. The geolocation is stored in a column geo_code as a string representation of a list. It looks like this:
# Geocode values are stored as objects/strings
df_tweets.geo_code[0]
# Output:
'[-4.241751 55.858303]'
I tested converting one row of geo_code to a list of longitude-latitude floats:
# Converting string representation of list to list using strip and split
# Can't use json.loads() or ast.literal_eval() because there's no comma delimiter

# --- Test with one tweet --- #
ini_list = df_tweets.geo_code[0]

# Converting string to list, but it will convert
# the lon and lat values to strings
# i.e. ['-4.241751', '55.858303']
results = ini_list.strip('][').split(' ')

# So, we must convert string lon and lat to floats
results = list(map(float, results))

# printing final result and its type
print("final list", results)
print(type(results))
This gives me:
# Output:
final list [-4.241751, 55.858303]
<class 'list'>
Success! Except no. I wrote it as a helper function:
def str_to_float_list(list_as_str):
    '''
    Function to convert a string representation
    of a list into a list of floats
    using strip and split, when you can't use json.loads()
    or ast.literal_eval() because there's no comma delimiter

    Parameter:
    list_as_str = string representation of a list.
    '''
    # Convert string to list
    str_list = list_as_str.strip('][').split(' ')
    # Convert strings inside list to float
    float_list = list(map(float, str_list[0]))
    return float_list
And when I run:
df_tweets['geocode'] = df_tweets['geo_code'].apply(str_to_float_list)
it gives me a ValueError when it encounters the minus - sign. I can't figure out why?! What am I missing?
Here's the full error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-94-c1035312dc12> in <module>()
20
21
---> 22 df_tweets['geocode'] = df_tweets['geo_code'].apply(str_to_float_list)
1 frames
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-94-c1035312dc12> in str_to_float_list(list_as_str)
15
16 # Convert strings inside list to float
---> 17 float_list = list(map(float, str_list[0]))
18
19 return float_list
ValueError: could not convert string to float: '-'
On your line 17,
float_list = list(map(float, str_list[0]))
You do not need to reference the index. Pass the entire list instead, like this:
float_list = list(map(float, str_list))
The reason is that str_list[0] is a single string, so map iterates over it like a list of characters, converting each one to a float in turn: first "-", then "4", and so on, which is why it fails on the minus sign.
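Put together, a minimal sketch of the corrected helper (same logic as in the question, with only that line changed):

def str_to_float_list(list_as_str):
    '''
    Convert a string representation of a list (no comma delimiter)
    into a list of floats using strip and split.
    '''
    str_list = list_as_str.strip('][').split(' ')
    return list(map(float, str_list))   # pass the whole list, not str_list[0]

# Example with the value from the question:
# str_to_float_list('[-4.241751 55.858303]')  ->  [-4.241751, 55.858303]
# df_tweets['geocode'] = df_tweets['geo_code'].apply(str_to_float_list)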

pd.rename key KeyError: 'New_Name'

Edit 12/07/19: The problem was not in fact with the pd.rename function but with the fact that I did not return the pandas DataFrame from the function, so the column change did not exist when printing, i.e.
def change_column_names(as_pandas, old_name, new_name):
    as_pandas.rename(columns={old_name: new_name}, inplace=True)
    return as_pandas  # <- This was missing
Please see the user comment below and upvote them for finding this error for me.
Alternatively, you can continue reading.
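For context, a minimal sketch of the two usual patterns (the DataFrame and column names here are just placeholders):

import pandas as pd

df = pd.DataFrame({'Unique Pageviews': [5608, 360]})

# Pattern 1: mutate in place. rename(..., inplace=True) returns None,
# so callers must not rely on its return value; they keep using df.
def change_column_names_inplace(as_pandas):
    as_pandas.rename(columns={'Unique Pageviews': 'Page_Views'}, inplace=True)

# Pattern 2: return the renamed copy and reassign it at the call site.
def change_column_names(as_pandas, old_name, new_name):
    return as_pandas.rename(columns={old_name: new_name})

df = change_column_names(df, 'Unique Pageviews', 'Page_Views')
print(df.columns)  # Index(['Page_Views'], dtype='object')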
The data can be downloaded from this link, but I have added a sample dataset. The file is not formatted as a typical CSV; I believe this may have been an assessment piece related to the Hidden Decision Tree article. I have included the portion of the code that deals with the format of the text file, as mentioned above, and allows the user to rename the column.
The problem occurred when I tried to create a renaming function:
def change_column_names(as_pandas, old_name, new_name):
    as_pandas.rename(columns={old_name: new_name}, inplace=True)
However, it seems to work when I set the column names inside the rename function:
def change_column_names(as_pandas):
    as_pandas.rename(columns={'Unique Pageviews': 'Page_Views'}, inplace=True)
    return as_pandas
Sample Dataset
Title URL Date Unique Pageviews
oupUrl=tutorials 18-Apr-15 5608
"An Exclusive Interview with Data Expert, John Bottega" http://www.datasciencecentral.com/forum/topics/an-exclusive-interview-with-data-expert-john-bottega?groupUrl=announcements 10-Jun-14 360
Announcing Composable Analytics http://www.datasciencecentral.com/forum/topics/announcing-composable-analytics 15-Jun-14 367
Announcing the release of Spark 1.5 http://www.datasciencecentral.com/forum/topics/announcing-the-release-of-spark-1-5 12-Sep-15 156
Are Extreme Weather Events More Frequent? The Data Science Answer http://www.datasciencecentral.com/forum/topics/are-extreme-weather-events-more-frequent-the-data-science-answer 5-Oct-15 204
Are you interested in joining the University of California for an empiricalstudy on 'Big Data'? http://www.datasciencecentral.com/forum/topics/are-you-interested-in-joining-the-university-of-california-for-an 7-Feb-13 204
Are you smart enough to work at Google? http://www.datasciencecentral.com/forum/topics/are-you-smart-enough-to-work-at-google 11-Oct-15 3625
"As a software engineer, what's the best skill set to have for the next 5-10years?" http://www.datasciencecentral.com/forum/topics/as-a-software-engineer-what-s-the-best-skill-set-to-have-for-the- 12-Feb-16 2815
A Statistician's View on Big Data and Data Science (Updated) http://www.datasciencecentral.com/forum/topics/a-statistician-s-view-on-big-data-and-data-science-updated-1 21-May-14 163
A synthetic variance designed for Hadoop and big data http://www.datasciencecentral.com/forum/topics/a-synthetic-variance-designed-for-hadoop-and-big-data?groupUrl=research 26-May-14 575
A Tough Calculus Question http://www.datasciencecentral.com/forum/topics/a-tough-calculus-question 10-Feb-16 937
Attribution Modeling: Key Analytical Strategy to Boost Marketing ROI http://www.datasciencecentral.com/forum/topics/attribution-modeling-key-concept 24-Oct-15 937
Audience expansion http://www.datasciencecentral.com/forum/topics/audience-expansion 6-May-13 223
Automatic use of insights http://www.datasciencecentral.com/forum/topics/automatic-use-of-insights 27-Aug-15 122
Average length of dissertations by higher education discipline. http://www.datasciencecentral.com/forum/topics/average-length-of-dissertations-by-higher-education-discipline 4-Jun-15 1303
This is the full code that produces the KeyError:
def change_column_names(as_pandas):
    as_pandas.rename(columns={'Unique Pageviews': 'Page_Views'}, inplace=True)

def change_column_names(as_pandas, old_name, new_name):
    as_pandas.rename(columns={old_name: new_name}, inplace=True)

def change_column_names(as_pandas):
    as_pandas.rename(columns={'Unique Pageviews': 'Page_Views'},
                     inplace=True)

def open_as_dataframe(file_name_in):
    reader = pd.read_csv(file_name_in, encoding='windows-1251')
    return reader

# Get each column of data including the heading and separate each element
# i.e. Title, URL, Date, Page Views
# and save to string_of_rows with comma separator for storage as a csv
# file.
def get_columns_of_data(*args):
    # Function that accept variable length arguments
    string_of_rows = str()
    num_cols = len(args)
    try:
        if num_cols > 0:
            for number, element in enumerate(args):
                if number == (num_cols - 1):
                    string_of_rows = string_of_rows + element + '\n'
                else:
                    string_of_rows = string_of_rows + element + ','
    except UnboundLocalError:
        print('Empty file \'or\' No arguments received, cannot be zero')
    return string_of_rows

def open_file(file_name):
    try:
        with open(file_name) as csv_file_in, open('HDT_data5.txt', 'w') as csv_file_out:
            csv_read = csv.reader(csv_file_in, delimiter='\t')
            for row in csv_read:
                try:
                    row[0] = row[0].replace(',', '')
                    csv_file_out.write(get_columns_of_data(*row))
                except TypeError:
                    continue
        print("The file name '{}' was successfully opened and read".format(file_name))
    except IOError:
        print('File not found \'OR\' Not in current directory\n')

# All acronyms used in variable naming correspond to the function at time
# of return from function.
# csv_list being a list of the csv file contents, the remainder i.e. 'st' of
# csv_list_st = split_title().
def main():
    open_file('HDTdata3.txt')
    multi_sets = open_as_dataframe('HDT_data5.txt')
    # change_column_names(multi_sets)
    change_column_names(multi_set, 'Old_Name', 'New_Name')
    print(multi_sets)

main()
I cleaned up your code so it would run. You were changing the column names but not returning the result. Try the following:
import pandas as pd
import numpy as np
import math

def set_new_columns(as_pandas):
    titles_list = ['Year > 2014', 'Forum', 'Blog', 'Python', 'R',
                   'Machine_Learning', 'Data_Science', 'Data',
                   'Analytics']
    for number, word in enumerate(titles_list):
        as_pandas.insert(len(as_pandas.columns), titles_list[number], 0)

def title_length(as_pandas):
    # Insert new column header then count the number of letters in 'Title'
    as_pandas.insert(len(as_pandas.columns), 'Title_Length', 0)
    as_pandas['Title_Length'] = as_pandas['Title'].map(str).apply(len)

# Although it is log, percentage of change is inverse linear comparison of
# logX1 - logX2, therefore you could think of it as the percentage change
# in Page Views. The map function allows for the function to be performed
# on all rows in column 'Page_Views'.
def log_page_view(as_pandas):
    # Insert new column header
    as_pandas.insert(len(as_pandas.columns), 'Log_Page_Views', 0)
    as_pandas['Log_Page_Views'] = as_pandas['Page_Views'].map(lambda x: math.log(1 + float(x)))

def change_to_numeric(as_pandas):
    # Check for missing values then convert the column to numeric.
    as_pandas = as_pandas.replace(r'^\s*$', np.nan, regex=True)
    as_pandas['Page_Views'] = pd.to_numeric(as_pandas['Page_Views'],
                                            errors='coerce')

def change_column_names(as_pandas):
    as_pandas.rename(columns={'Unique Pageviews': 'Page_Views'}, inplace=True)
    return as_pandas

def open_as_dataframe(file_name_in):
    reader = pd.read_csv(file_name_in, encoding='windows-1251')
    return reader

# Get each column of data including the heading and separate each element
# i.e. Title, URL, Date, Page Views
# and save to string_of_rows with comma separator for storage as a csv
# file.
def get_columns_of_data(*args):
    # Function that accept variable length arguments
    string_of_rows = str()
    num_cols = len(args)
    try:
        if num_cols > 0:
            for number, element in enumerate(args):
                if number == (num_cols - 1):
                    string_of_rows = string_of_rows + element + '\n'
                else:
                    string_of_rows = string_of_rows + element + ','
    except UnboundLocalError:
        print('Empty file \'or\' No arguments received, cannot be zero')
    return string_of_rows

def open_file(file_name):
    import csv
    try:
        with open(file_name) as csv_file_in, open('HDT_data5.txt', 'w') as csv_file_out:
            csv_read = csv.reader(csv_file_in, delimiter='\t')
            for row in csv_read:
                try:
                    row[0] = row[0].replace(',', '')
                    csv_file_out.write(get_columns_of_data(*row))
                except TypeError:
                    continue
        print("The file name '{}' was successfully opened and read".format(file_name))
    except IOError:
        print('File not found \'OR\' Not in current directory\n')

# All acronyms used in variable naming correspond to the function at time
# of return from function.
# csv_list being a list of the csv file contents, the remainder i.e. 'st' of
# csv_list_st = split_title().
def main():
    open_file('HDTdata3.txt')
    multi_sets = open_as_dataframe('HDT_data5.txt')
    multi_sets = change_column_names(multi_sets)
    change_to_numeric(multi_sets)
    log_page_view(multi_sets)
    title_length(multi_sets)
    set_new_columns(multi_sets)
    print(multi_sets)

main()

Markov analysis - Return and recursion role

I am working on the solution of the Markov analysis exercise in Think Python, but I do not understand the role of the return statement in the code block below.
As far as I know, when the code reaches return the function exits immediately. But isn't it unnecessary in this case? There is a recursive call random_text(n-i) before the code reaches the return statement, so won't the function exit only when the recursion is finished, which means when the for loop is over? The question may seem stupid, but I am a newbie in Python and recursion is really confusing to me. I tried removing return and it still ran fine.
def random_text(n=100):
    start = random.choice(list(suffix_map.keys()))
    for i in range(n):
        suffixes = suffix_map.get(start, None)
        if suffixes == None:
            # if the start isn't in map, we got to the end of the
            # original text, so we have to start again.
            random_text(n-i)
            return
        word = random.choice(suffixes)
        print(word, end=' ')
        start = shift(start, word)
The full code is below so you can understand what each function does.
from __future__ import print_function, division

import os
os.chdir(r"C:\Users\Hoang-Ngoc.Anh\Documents\WinPython-64bit 3.4.4.2\notebooks\docs")

import sys
import string
import random

# global variables
suffix_map = {}   # map from prefixes to a list of suffixes
prefix = ()       # current tuple of words

def process_file(filename, order=2):
    """Reads a file and performs Markov analysis.

    filename: string
    order: integer number of words in the prefix

    returns: map from prefix to list of possible suffixes.
    """
    fp = open(filename)
    skip_gutenberg_header(fp)
    for line in fp:
        for word in line.rstrip().split():
            process_word(word, order)

def skip_gutenberg_header(fp):
    """Reads from fp until it finds the line that ends the header.

    fp: open file object
    """
    for line in fp:
        if line.startswith('*END*THE SMALL PRINT!'):
            break

def process_word(word, order=2):
    """Processes each word.

    word: string
    order: integer

    During the first few iterations, all we do is store up the words;
    after that we start adding entries to the dictionary.
    """
    global prefix
    if len(prefix) < order:
        prefix += (word,)
        return

    try:
        suffix_map[prefix].append(word)
    except KeyError:
        # if there is no entry for this prefix, make one
        suffix_map[prefix] = [word]

    prefix = shift(prefix, word)

def random_text(n=100):
    """Generates random words from the analyzed text.

    Starts with a random prefix from the dictionary.

    n: number of words to generate
    """
    # choose a random prefix (not weighted by frequency)
    start = random.choice(list(suffix_map.keys()))

    for i in range(n):
        suffixes = suffix_map.get(start, None)
        if suffixes == None:
            # if the start isn't in map, we got to the end of the
            # original text, so we have to start again.
            random_text(n-i)
            return

        # choose a random suffix
        word = random.choice(suffixes)
        print(word, end=' ')
        start = shift(start, word)

def shift(t, word):
    """Forms a new tuple by removing the head and adding word to the tail.

    t: tuple of strings
    word: string

    Returns: tuple of strings
    """
    return t[1:] + (word,)

def main(script, filename='emma.txt', n=100, order=2):
    try:
        n = int(n)
        order = int(order)
    except ValueError:
        print('Usage: %d filename [# of words] [prefix length]' % script)
    else:
        process_file(filename, order)
        random_text(n)
        print()

if __name__ == '__main__':
    main(*sys.argv)

Why won't it write to the file?

So I have the code:
def logdata(x, y):
    try:
        f = open('multlog.txt', 'a')
        f.write("{0:g} * {1:g} = {2:g}\n".format(x, y, (x*y)))
    except ValueError:
        f.write("Error, you tried to multiply by something that wasn't a number")
        raise
    finally:
        f.close()

print("This is a test program, it logs data in a text file, 'multlog.txt'")
fn = input("Enter the first number you'd like to multiply by: ")
sn = input("Enter the second number you'd like to multiply by: ")
logdata(int(fn), int(sn))
What I want it to do, when it reaches a ValueError, is write "Error, you tried to multiply by something that wasn't a number" to the file. But if the user inputs a letter, say "j", I get ValueError: invalid literal for int() with base 10: 'j', and nothing is written to the file!
There are at least two problems:
The file is not open for writing (or appending) in the except block.
As #DSM points out in a comment, the ValueError is raised when you call int(), before logdata is ever entered.
I would rewrite it to something like the example below.
If you use the with statement, you can do without the finally block.
def logdata(x, y):
    with open('multlog.txt', 'a') as f:
        try:
            x = int(x); y = int(y)
            f.write("{0:g} * {1:g} = {2:g}\n".format(x, y, (x*y)))
        except ValueError:
            f.write("Error")

print("This is a test program, it logs data in a text file, 'multlog.txt'")
fn = input("Enter the first number you'd like to multiply by: ")
sn = input("Enter the second number you'd like to multiply by: ")
logdata(fn, sn)
