I want to convert this
(554, 334, 24, 15)
to
['554', '334', '24', '15']
If there is a similar question already, sorry, I didn't find one.
print(list(map(str, list((554, 334, 24, 15)))))
result = ['554', '334', '24', '15']
The tuple is first converted to a list with list(). map() then applies the same function, str, to each element, converting every int to a string, and the resulting list is printed. The inner list() call is optional, since map() accepts any iterable, including a tuple.
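As a small sketch of the same conversion, here is the map() version next to an equivalent list comprehension. The variable names are just for illustration.

```python
numbers = (554, 334, 24, 15)

# map() accepts the tuple directly; list() materializes the result.
via_map = list(map(str, numbers))

# Equivalent list comprehension.
as_strings = [str(n) for n in numbers]

print(as_strings)  # ['554', '334', '24', '15']
```

Both forms give the same list, so which one to use is mostly a matter of taste.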
I have not used pandas before but it looks like it could be a really nice tool for data manipulation. I am using python 3.7 and pandas 1.2.3.
I am passing a list of dictionaries to the dataframe that has 2 pieces to it. A sample of the dictionary would look like this:
data = [
{"Knowledge_Base_Link__c": None, "ClosedDate": "2021-01-06T19:02:14.000+0000"},
{"Knowledge_Base_Link__c": "http://someurl.com", "ClosedDate": "2021-01-08T21:26:49.000+0000"},
{"Knowledge_Base_Link__c": "http://someotherurl.com", "ClosedDate": "2021-01-09T20:35:58.000+0000"}
]
df = pd.DataFrame(data)
# Then I format the ClosedDate like so
df['ClosedDate'] = pd.to_datetime(df['ClosedDate'], format="%y-%m-%d", exact=False)
# Next I get a count of the data
articles = df.resample('M', on='ClosedDate').count()
# print the results to the console
print(articles)
These are the results, and they are exactly what I want.
However, if I convert that to a list, or push it to a dictionary to use the data as below, the first column (the index, I presume) is missing from the output.
articles_by_month = articles.to_dict('records')
This final output is almost what I want, but it is missing the index column.
This is what I am getting:
[{'ClosedDate': 15, 'Knowledge_Base_Link__c': 5}, {'ClosedDate': 18, 'Knowledge_Base_Link__c': 11}, {'ClosedDate': 12, 'Knowledge_Base_Link__c': 6}]
This is what I want:
[{'Date': '2021-01-31', 'ClosedDate': 15, 'Knowledge_Base_Link__c': 5}, {'Date': '2021-02-28', 'ClosedDate': 18, 'Knowledge_Base_Link__c': 11}, {'Date': '2021-03-31', 'ClosedDate': 12, 'Knowledge_Base_Link__c': 6}]
A couple of things I have tried:
df.reset_index(level=0, inplace=True)
# This just takes the sum and puts it in a column called index. I am not sure how to get the date as it is displayed in the first column of the screenshot.
# I also tried this
df['ClosedDate'] = df.index
# However, this gives me an "Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'" error.
I thought this would be simple, and I checked the pandas docs and many other Stack Overflow posts, but I cannot find a way to do this. Any thoughts on this would be appreciated.
Thanks
You can get an additional key in the dict with
articles.reset_index().to_dict('records')
But BEFORE that you have to rename your index, since ClosedDate (the index's name) is already a column:
articles.index = articles.index.rename('Date')
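Putting the two steps together, a minimal sketch might look like this. The small articles frame here is hand-built from the counts shown in the question rather than produced by resample(), and the strftime step is an optional addition for getting plain date strings instead of Timestamp objects.

```python
import pandas as pd

# Hand-built stand-in for the resampled frame (an assumption for this sketch):
# the index and one column are both named "ClosedDate", as in the question.
articles = pd.DataFrame(
    {"ClosedDate": [15, 18, 12], "Knowledge_Base_Link__c": [5, 11, 6]},
    index=pd.DatetimeIndex(
        ["2021-01-31", "2021-02-28", "2021-03-31"], name="ClosedDate"
    ),
)

# Rename the index so reset_index() does not collide with the existing column.
articles.index = articles.index.rename("Date")

flat = articles.reset_index()

# Optional: turn the timestamps into plain date strings.
flat["Date"] = flat["Date"].dt.strftime("%Y-%m-%d")

articles_by_month = flat.to_dict("records")
print(articles_by_month)
```

Without the rename, reset_index() would try to insert a second ClosedDate column and fail, which is why the order of the two steps matters.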
I currently have a list of values and an Awkward Array of integers. I want an Awkward Array with the same dimensions, but where each integer is replaced by the element of "values" at that index. For instance:
values = ak.Array(np.random.rand(100))
arr = ak.Array((np.random.randint(0, 100, 33), np.random.randint(0, 100, 125)))
I want something like values[arr], but that gives the following error:
>>> values[arr]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Anaconda3\lib\site-packages\awkward\highlevel.py", line 943, in __getitem__
return ak._util.wrap(self._layout[where], self._behavior)
ValueError: cannot fit jagged slice with length 2 into RegularArray of size 100
If I run it with a loop, I get back what I want:
>>> values = ([values[i] for i in arr])
>>> values
[<Array [0.842, 0.578, 0.159, ... 0.726, 0.702] type='33 * float64'>, <Array [0.509, 0.45, 0.202, ... 0.906, 0.367] type='125 * float64'>]
Is there another way to do this, or is this it? I'm afraid it'll be too slow for my application.
Thanks!
If you're trying to avoid Python for loops for performance, note that the first line casts a NumPy array as Awkward with ak.from_numpy (no loop, very fast):
>>> values = ak.Array(np.random.rand(100))
but the second line iterates over data in Python (has a slow loop):
>>> arr = ak.Array((np.random.randint(0, 100, 33), np.random.randint(0, 100, 125)))
because a tuple of two NumPy arrays is not a NumPy array. It's a generic iterable, and the constructor falls back to ak.from_iter.
On your main question, the reason that arr doesn't slice values is that arr is a jagged array and values is not:
>>> values
<Array [0.272, 0.121, 0.167, ... 0.152, 0.514] type='100 * float64'>
>>> arr
<Array [[15, 24, 9, 42, ... 35, 75, 20, 10]] type='2 * var * int64'>
Note the types: values has type 100 * float64 and arr has type 2 * var * int64. There's no rule for values[arr].
Since it looks like you want to slice values with arr[0] and then arr[1] (from your list comprehension), it could be done in a vectorized way by duplicating values for each element of arr, then slicing.
>>> # The np.newaxis is to give values a length-1 dimension before concatenating.
>>> duplicated = ak.concatenate([values[np.newaxis]] * 2)
>>> duplicated
<Array [[0.272, 0.121, ... 0.152, 0.514]] type='2 * 100 * float64'>
Now duplicated has length 2 and one level of nesting, just like arr, so arr can slice it. The resulting array also has length 2, but the length of each sublist is the length of each sublist in arr, rather than 100.
>>> duplicated[arr]
<Array [[0.225, 0.812, ... 0.779, 0.665]] type='2 * var * float64'>
>>> ak.num(duplicated[arr])
<Array [33, 125] type='2 * int64'>
If you're scaling up from 2 such lists to a large number, then this would eat up a lot of memory. Then again, the size of the output of this operation would also scale as "length of values" × "length of arr". If this "2" is not going to scale up (if it will be at most thousands, not millions or more), then I wouldn't worry about the speed of the Python for loop. Python scales well for thousands, but not billions (depending, of course, on the size of the things being scaled!).
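If the duplication turns out to be too expensive, one alternative pattern is to flatten the indices, do a single lookup, and split the result back into the original lengths. The sketch below is a plain NumPy analogue of that idea, not the method described above, and its names are made up for illustration. In Awkward Array, ak.flatten, ak.num and ak.unflatten are the functions that play the corresponding roles.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.random(100)

# Two index lists of different lengths, standing in for the jagged array.
index_lists = [rng.integers(0, 100, 33), rng.integers(0, 100, 125)]

# Remember how long each sublist is, then flatten the indices.
counts = [len(ix) for ix in index_lists]
flat_index = np.concatenate(index_lists)

# One vectorized lookup over all indices at once.
flat_result = values[flat_index]

# Split the flat result back into pieces of the original lengths.
result = np.split(flat_result, np.cumsum(counts)[:-1])

print([len(r) for r in result])  # [33, 125]
```

This keeps memory proportional to the total number of indices rather than to the number of sublists times the length of values.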
import math
array = [16,5,3,4,11,9,13]
for x in array[0:len(array)-1]:
    key=x
    index=array.index(x)
    posj=index
    for y in array[index+1:len(array)]:
        if y<key:
            key=y
            posj=array.index(y)
    if index!=posj:
        hold=array[index]
        array[index]=key
        array[posj]=hold
print(array)
I'm trying to implement insertion sort.
It appears after using the debugger that in every loop iteration, it is using the array [16,5,3,4,11,9,13] instead of the updated array that results after a loop iteration.
How can I make x be the updated element for the given index?
Instead of
for x in array[0:len(array)-1]:
try
for x in array:
Output
[3, 4, 5, 9, 11, 13, 16]
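The reason this helps is that array[0:len(array)-1] is a slice, and a slice is a copy, so the loop keeps reading the original values while the swaps happen in array itself. A sketch that sidesteps the issue by looping over indices is shown below. Note that the algorithm in the question, which finds the smallest remaining element and swaps it forward, is usually called selection sort rather than insertion sort. The function name is only illustrative.

```python
def selection_sort(items):
    """Sort a list in place by swapping the smallest remaining element forward."""
    n = len(items)
    for i in range(n - 1):
        # Find the position of the smallest element in items[i:].
        smallest = i
        for j in range(i + 1, n):
            if items[j] < items[smallest]:
                smallest = j
        # Swap it into position i if it is not already there.
        if smallest != i:
            items[i], items[smallest] = items[smallest], items[i]
    return items


array = [16, 5, 3, 4, 11, 9, 13]
print(selection_sort(array))  # [3, 4, 5, 9, 11, 13, 16]
```

Working with indices also avoids array.index(), which returns the first match and can pick the wrong position when the list contains duplicates.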
I have two arguments that I want to print
print('{0:25}${2:>5.2f}'.format('object', 20))
But this gives the following error:
Traceback (most recent call last):
IndexError: tuple index out of range
But I get the desired output when I changed the code to the following:
print('{0:25}${2:>5.2f}'.format('object', 20, 20))
I don't understand why as I only have two sets of {}. Thanks
Your problem is the index 2 after the $ sign:
print('{0:25}${2:>5.2f}'.format('object', 20, 20))
When you call .format on a string in Python, the number in {number:} is the index of the argument you want inserted there.
For example, the following:
"hello there {1:} i want you to give me {0:} dollars".format(2,"Tom")
will result in the following output:
'hello there Tom i want you to give me 2 dollars'
There is a simple example here:
https://www.programiz.com/python-programming/methods/string/format
So, to sum up, for your code to work just use:
print('{0:25}${1:>5.2f}'.format('object', 20))
It should be
>>> print('{0:25}${1:>5.2f}'.format('object', 20))
object                   $20.00
Note the change of the placeholder from 2 to 1
print('{0:25}${1:>5.2f}'.format('object', 20))
###            ^
When you add a third parameter (a second 20), the placeholder 2 finds a value
>>> print('{0:25}${2:>5.2f}'.format('object', 20, 20))
object                   $20.00
But without the third parameter, an index out of range exception is thrown.
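The behaviour described above can be checked with a short sketch. It also shows automatic numbering, where the indexes are left out and the fields are filled in order. The variable names are only for illustration.

```python
# Explicit indexes: 0 is 'object', 1 is 20.
explicit = '{0:25}${1:>5.2f}'.format('object', 20)

# Automatic numbering: empty field names are filled left to right.
automatic = '{:25}${:>5.2f}'.format('object', 20)

# Index 2 asks for a third argument that was never passed.
try:
    '{0:25}${2:>5.2f}'.format('object', 20)
except IndexError as exc:
    error = exc

print(explicit)
print(type(error).__name__)  # IndexError
```

Leaving the indexes out is often the simplest choice when each argument is used exactly once and in order.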
I have these arrays that I want to put into a pandas DataFrame.
date_quote = []
price1 = []
price2 = []
The arrays have been filled with values. price1[] and price2[] contain floating-point values, while date_quote[] contains datetime values.
This is how I assign the arrays into the pandas DataFrame.
df = pd.DataFrame({'price1 ': price1,
                   'price2 ': price2,
                   'date': date_quote
                   })
I get the following error:
File "pandas\_libs\tslib.pyx", line 492, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 537, in pandas._libs.tslib.array_to_datetime
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True
File "pandas\_libs\tslibs\conversion.pyx", line 178, in pandas._libs.tslibs.conversion.datetime_to_datetime64
File "pandas\_libs\tslibs\conversion.pyx", line 387, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject
AttributeError: 'pywintypes.datetime' object has no attribute 'nanosecond'
The problem comes from assigning date_quote[] which is datetime type. The code runs successfully if I do not assign date_quote[] into the dataframe.
Contents of date_quote[1] looks like 2018-07-26 00:00:00+00:00. I only need the date and do not need the time information in date_quote[]. Do I need to do any extra conversion to store this datetime type date_quote[] array into the dataframe?
The output of print (date_quote[:3]) is
[pywintypes.datetime(2018, 7, 26, 0, 0, tzinfo=TimeZoneInfo('GMT Standard Time', True)), pywintypes.datetime(2018, 7, 27, 0, 0, tzinfo=TimeZoneInfo('GMT Standard Time', True)), pywintypes.datetime(2018, 7, 30, 0, 0, tzinfo=TimeZoneInfo('GMT Standard Time', True))]
I am using python v3.6
I found the answer to my own question. The key lies in removing the time information from date_quote[], leaving behind only the date information.
for x in range(0,int(num_elements)):
    date_quote[x] = date_quote[x].date()
The assignment works without error after the time information is removed.
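The same step can be sketched with a list comprehension. Because pywintypes is only available on Windows, the sketch below uses ordinary timezone-aware datetime.datetime objects as stand-ins for the pywintypes.datetime values. That substitution is an assumption made for illustration, and .date() is called the same way as in the loop above.

```python
from datetime import date, datetime, timezone

# Stand-ins for the timezone-aware values in date_quote (assumed for this sketch).
date_quote = [
    datetime(2018, 7, 26, tzinfo=timezone.utc),
    datetime(2018, 7, 27, tzinfo=timezone.utc),
    datetime(2018, 7, 30, tzinfo=timezone.utc),
]

# Keep only the calendar date, dropping the time and timezone.
dates_only = [d.date() for d in date_quote]

print(dates_only)
```

Building a new list this way leaves date_quote untouched, which can be handy if the time information is still needed elsewhere.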
You can also use the dateutil module to extract the date and time from the string representation of the pywintypes.datetime object. This way you can keep the time part too. Code below tested with Python 3.6.
import datetime, dateutil.parser, pywintypes
today = datetime.datetime.now() # get today's date as an example
mypydate = pywintypes.Time(today) # create a demo pywintypes object
mydate = dateutil.parser.parse(str(mypydate)) # convert it to a python datetime
print(mydate)
print(type(mydate))
Output:
2019-05-11 12:44:03.320533
<class 'datetime.datetime'>