I'm amazed not to have come across any such site so far.
For example, when I'm using Python's dict() function, I want to know what different parameters it can accept. Something like the following:
return: dict, dict(list)
return: dict, dict(dict)
return: dict, dict(dict + dict)
return: dict, dict(tuple of format = (element=value, element2=value2))
Everywhere I search on the Internet it just brings me to limited examples rather than showing me a signature.
Here on Stackoverflow.com, I came across a question that stated we cannot use dict(dict), and that the dict() function can only be used with a list.
Is there any site/link that shows most of the ways that dict can be used or the signature of dict in the above format?
Follow the official documentation of Python:
https://docs.python.org/3.6/
You will get most of the ideas from it.
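As a quick illustration of what the documentation describes, here is a minimal sketch of the ways dict() can be called. The variable names are only for illustration. Running help(dict) in the interpreter prints the same signatures.

```python
# The documented ways to call dict(); each call builds a new dictionary.
empty = dict()                              # no arguments
from_mapping = dict({"a": 1, "b": 2})       # an existing mapping, so dict(dict) works
from_pairs = dict([("a", 1), ("b", 2)])     # an iterable of (key, value) pairs
from_kwargs = dict(a=1, b=2)                # keyword arguments
combined = dict({"a": 1}, b=2)              # a mapping plus keyword arguments
merged = {**{"a": 1}, **{"b": 2}}           # "dict + dict": merge by unpacking

# All non-empty forms above build the same dictionary.
print(from_mapping == from_pairs == from_kwargs == combined == merged)
```

Note that dict(dict) returns a shallow copy, so changing the copy's keys does not affect the original.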
I want to make it like this:
>>> myfunc("strawberry")
ok
# myfunc only works with strawberry
I know that most people will answer with:
def myfunc(something):
    if something == "strawberry":
        print("ok")
But I want to do all this in the parameter setting.
Like, kind of like this:
def myfunc(something: OnlyThese["strawberry", "cake"]):
    print("ok")
Although the code above is very incorrect, I want to see if Python already has a feature like this.
I don't believe there is a way to do what you want without writing code in the function body.
I found answers to a similar question at
enforce arguments to a specific list of values
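The closest built-in feature I know of is typing.Literal (Python 3.8+), which lets static type checkers flag other values but does nothing at runtime. A minimal sketch that combines it with a body check follows; the alias name Flavor is made up for this example.

```python
from typing import Literal, get_args

# Hypothetical alias listing the only accepted values.
Flavor = Literal["strawberry", "cake"]


def myfunc(something: Flavor) -> None:
    # Literal is only checked by static tools such as mypy,
    # so the allowed values are validated here at runtime as well.
    if something not in get_args(Flavor):
        raise ValueError(f"expected one of {get_args(Flavor)}, got {something!r}")
    print("ok")


myfunc("strawberry")
```

Calling myfunc("banana") raises ValueError, and a type checker would also report it before the code runs.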
I am looking at the Kedro library, as my team is considering using it for our data pipeline.
While going through the official tutorial (Spaceflight),
I came across this function:
def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
    """Preprocess the data for companies.

    Args:
        companies: Source data.
    Returns:
        Preprocessed data.
    """
    companies["iata_approved"] = companies["iata_approved"].apply(_is_true)
    companies["company_rating"] = companies["company_rating"].apply(_parse_percentage)
    return companies
companies is the name of the CSV file containing the data.
Looking at the function, my assumption is that (companies: pd.DataFrame) is shorthand for reading the "companies" dataset as a dataframe. If so, I do not understand what -> pd.DataFrame at the end means.
I tried looking in the Python documentation for this style of code but did not manage to find anything.
Any help in understanding this is much appreciated.
Thank you
This is the way of declaring the types of your inputs: in (companies: pd.DataFrame), companies is the argument and pd.DataFrame is its type. In the same way, -> pd.DataFrame is the type of the output.
Overall, it says that the function takes companies of type pd.DataFrame and returns a value of type pd.DataFrame.
I hope that makes sense.
The -> notation is type hinting, as is the : part in the companies: pd.DataFrame function definition. This is not essential to do in Python but many people like to include it. The function definition would work exactly the same if it didn't contain this but instead read:
def preprocess_companies(companies):
This is a general Python thing rather than anything kedro-specific.
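To see that the hints are documentation rather than enforcement, here is a minimal sketch with a made-up function called double. Python stores the annotations but does not check them when the function is called.

```python
def double(value: int) -> int:
    # The annotations describe intent; Python does not enforce them at runtime.
    return value * 2


print(double(3))               # an int, as annotated
print(double("ab"))            # a str is accepted too, despite the int hint
print(double.__annotations__)  # the hints are simply stored on the function
```

Tools such as mypy or an IDE read these annotations to warn about the second call; the interpreter itself ignores them.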
The way that kedro registers companies as a kedro dataset is completely separate from this function definition and is done through the catalog.yml file:
companies:
type: pandas.CSVDataSet
filepath: data/01_raw/companies.csv
There will then be a node defined (in pipeline.py) to specify that the preprocess_companies function should take as input the kedro dataset companies:
node(
    func=preprocess_companies,
    inputs="companies",  # THIS LINE REFERS TO THE DATASET NAME
    outputs="preprocessed_companies",
    name="preprocessing_companies",
),
In theory the name of the parameter in the function itself could be completely different, e.g.
def preprocess_companies(anything_you_want):
... although it is very common to give it the same name as the dataset.
In this situation companies is technically any DataFrame. However, when wrapped in a Kedro Node object the correct dataset will be passed in:
Node(
    func=preprocess_companies,  # The function posted above
    inputs='raw_companies',  # Kedro will read from a catalog entry called 'raw_companies'
    outputs='processed_companies',  # Kedro will write to a catalog entry called 'processed_companies'
)
In essence, the parameter name isn't really important here; it has been named this way so that the person reading the code knows that it is semantically about companies, but the function name does that too.
The above is technically a simplification since I'm not getting into MemoryDataSets but hopefully it covers the main points.
Hi Stack Overflow community, I am new to Python and I hope you can help me.
I've tried many different programs, but I didn't get any results. Here is one:
import requests
url ="https://bboxxltd.atlassian.net/rest/servicedeskapi/servicedesk/CMS/queue/213/issue"
auth='XXXXXXXXXXXXXXXXXX', 'XXXXXXXXXXXXXXXX'
r = requests.get(url, auth=(auth))
data = r.json().get('summary')
print(data)
Output: None
I wanted to get the value of "summary". For example:
Output:
summary:REQUEST FOR DATA
When you do for i in data, the i variable takes the value of each key of data, one at a time. So normally you would use data[i] inside the for i in data loop.
If id is a high level attribute of data, you can simply do data['id'] outside the for loop. Anyways, this all depends on the structure of the returned JSON.
From the screenshot, you are getting:
KeyError: 'summary'
Which means that data is not an Array but an Object. You need to go down the Object further in order to reach the Array you are looking for. You need to inspect the data object; one good way to do this is to call print(data.keys()), this way, you'll find the attributes that you can access from data, until you get the array you are after.
Once you know the structure of the response:
# It looks like the array is multiple levels
# inside data, so it may look like this:
issues = data[key1][key2]...[keyn]
for issue in issues:
    if issue['id'] == issue_id:
        ...
As Pynchia points out,
if you can reach the elements of the array correctly,
i.e. i is correct,
then access summary via the fields key:
print(i['fields']['summary'])
Also, please post text rather than images.
Images can't be searched and therefore aren't useful to future readers.
You're asking us to volunteer our time for free to solve your problem, and you should make it as easy as possible for us to do so.
Why not upload images of code on SO when asking a question?
EDIT
Your question is unclear.
It is straightforward to ask for all the elements it contains:
for k, v in i['fields'].items():
    print(f'The value of {k} is {v}.')
In your example, one of those k keys will be 'summary'.
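Putting the advice above together, here is a minimal sketch that runs without any network call. The payload is invented for illustration (the "values" key and the field names are assumptions); the real response may be nested differently, so inspect its keys first.

```python
# Hypothetical payload shaped like a paged list of issues.
# The real API response may use different keys or nesting.
data = {
    "size": 2,
    "values": [
        {"id": "101", "fields": {"summary": "REQUEST FOR DATA", "status": "Open"}},
        {"id": "102", "fields": {"summary": "RESET PASSWORD", "status": "Done"}},
    ],
}

print(list(data.keys()))    # top-level keys to explore
print(data.get("summary"))  # None, because "summary" is not a top-level key

# Walk down to the list of issues, then into each issue's fields.
summaries = [issue["fields"]["summary"] for issue in data["values"]]
print(summaries)

# Listing every field of one issue requires .items() to get key/value pairs.
for key, value in data["values"][0]["fields"].items():
    print(f"The value of {key} is {value}.")
```

This also shows why the original program printed None: .get('summary') was called on the outermost object, where that key does not exist.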
I am parsing a .fasta file containing one big sequence into Python using:
for rec in SeqIO.parse(faFile, "fasta"):
    identifier=(rec.id)
    sequence=(rec.seq)
Then, I am building a dictionary:
d={identifier:sequence}
When printing sequence only, I get the following result:
CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT
Note: all letters are printed; I added the dots to shorten this.
When printing the dictionary, I get:
{'NC_003047.1': Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet())}
Where do the "Seq" and the SingleLetterAlphabet come from?
Desired result would be:
{'NC_003047.1':'CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT'}
Update1:
following the link in the comments, I tried
input_file=open(faFile)
d=SeqIO.to_dict(SeqIO.parse(faFile,"fasta"))
resulting in:
{'NC_003047.1': SeqRecord(seq=Seq('CAGCCAGATGGGGGGAGGGGTGAGCGCTCTCCCGCTCAAAACCTCCAGCACTTT...CAT', SingleLetterAlphabet()), id='NC_003047.1', name='NC_003047.1', description='NC_003047.1 Sinorhizobium meliloti 1021 chromosome, complete genome', dbxrefs=[])}
So, sadly, this does not help :(
Thanks in advance for your time and effort :)
SeqIO doesn't return a string, it returns an object. When you print it, you print the object's string representation, which in this case is not just the data contained in (some attribute of) the object.
(Some objects are designed so that printing the object will print just the data inside it. This depends on how the library is put together and how the programmer designed its __str__() method. This is probably not useful for you at this point, but might help you understand other related resources you find if you pursue this further.)
I'm not familiar with SeqIO but quick googling suggests you probably want
d={identifier: str(sequence)}
to store just the plain string form of the Seq object (which is what rec.seq already is) as the value for this identifier.
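The difference between printing the object alone and printing it inside a dictionary comes down to __str__() versus __repr__(): print() on an object uses __str__(), while a container shows each element with __repr__(). Here is a minimal sketch using a made-up FakeSeq class, which is only a stand-in and not Biopython's Seq.

```python
class FakeSeq:
    """Hypothetical stand-in for a sequence object; not Biopython's Seq."""

    def __init__(self, data: str) -> None:
        self.data = data

    def __repr__(self) -> str:
        # Used when the object is shown inside a container such as a dict.
        return f"FakeSeq({self.data!r})"

    def __str__(self) -> str:
        # Used by str() and by print() on the object itself.
        return self.data


seq = FakeSeq("CAGCCAGATG")
print(seq)                        # CAGCCAGATG
print({"NC_003047.1": seq})       # {'NC_003047.1': FakeSeq('CAGCCAGATG')}
print({"NC_003047.1": str(seq)})  # {'NC_003047.1': 'CAGCCAGATG'}
```

Assuming the real Seq object behaves the same way, converting it with str() before storing it should give the desired dictionary of plain strings.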
In boto3 there's a function:
ec2.instances.filter()
The documentation:
http://boto3.readthedocs.org/en/latest/reference/services/ec2.html#instance
It says it returns a list(ec2.Instance). I wish...
When I try printing the return value I get this:
ec2.instancesCollection(ec2.ServiceResource(), ec2.Instance)
I've tried searching for any mention of an ec2.instancesCollection, but the only thing I found was something similar for Ruby.
I'd like to iterate through this instancesCollection so I can see how big it is, what machines are present and things like that.
The problem is I have no idea how it works, and when it's empty, iteration doesn't work at all (it throws an error).
The filter method does not return a list, it returns an iterable. This is basically a Python generator that will produce the desired results on demand in an efficient way.
You can use this iterator in a loop like this:
for instance in ec2.instances.filter():
    # do something with instance
or if you really want a list you can turn the iterator into a list with:
instances = list(ec2.instances.filter())
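To illustrate the general pattern without needing AWS credentials, here is a minimal sketch in which a plain generator stands in for the collection. The function fake_filter and the instance names are made up; this shows how lazy iterables behave in Python, not boto3's internals.

```python
from typing import Iterator, Sequence


def fake_filter(names: Sequence[str]) -> Iterator[str]:
    """Hypothetical stand-in for ec2.instances.filter(); yields items lazily."""
    for name in names:
        yield name


# Looping over an empty iterable simply runs zero times.
for name in fake_filter([]):
    print(name)

# A lazy iterable has no len(); materialise it as a list to count the items.
instances = list(fake_filter(["web-1", "web-2"]))
print(len(instances))  # 2

# An empty list is falsy, which gives a simple emptiness check.
if not list(fake_filter([])):
    print("no instances matched")
```

Turning the result into a list first makes it possible to check its size and to loop over it more than once.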
I'm adding this answer because 5 years later I had the same question and went round in circles trying to find the answer.
First off, the return type in the documentation is wrong (still). As you say, it states that the return type is: list(ec2.Instance)
where it should be: ec2.instancesCollection.
At the time of writing there's an open issue on GitHub covering this - https://github.com/boto/boto3/issues/2000.
When you call the filter method a ResourceCollection is created for the particular type of resource against which you called the method. In this case the resource type is instance which gives an instancesCollection. You can see the code for the ResourceCollection superclass of instancesCollection here:
https://github.com/boto/boto3/blob/develop/boto3/resources/collection.py
The documentation here gives an overview of the collections: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/collections.html
To get to how to use it and actually answer your question, what I did was to turn the iterator into a list and iterate over the list if the size is > 0.
testList = list(ec2.instances.filter(Filters=filters))
if len(testList) > 0:
    for item in testList:
        .
        .
        .
This may well not be the best way of doing it but it worked for me.