Script for Automation to pull from CSV - python-3.x

Hope all is well.
I'm working on a script to automate pulling data from a CSV and posting it to an API.
However, when I dump the CSV, the value is stored with square brackets, and I would like to remove them before submitting.
import json
import pandas as pd

df = pd.read_csv('test1.csv')
for index, data in df.iterrows():
    payload = json.dumps({"adapters": data["MAC Addresses"], "typeLabel": "Windows", "iconType": "windows",
                          "operatingSystem": "Windows", "hostName": data["Endpoint Name"], "role": "Staff"})
    print(payload)
This is my script; adapters is the field whose data I want to manipulate before posting.
In the CSV file, the value is saved like this:
['08:92:04:0a:ec:00']
Whenever the code runs, it comes out as this:
"['08:92:04:0a:ec:00']"
But I want it to be like this: ["08:92:04:0a:ec:00"]. The API only accepts that format.
Is there any way this can be accomplished? Much appreciated.
I've tried everything; the only solution left is to learn from the experts.
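A minimal sketch of one way to handle this, assuming the CSV column really stores a Python-style list literal such as ['08:92:04:0a:ec:00']: parse the string back into a list with ast.literal_eval before building the payload, and json.dumps will then emit a proper JSON array with double quotes.

import ast
import json
import pandas as pd

df = pd.read_csv('test1.csv')
for index, data in df.iterrows():
    # Turn the stored string "['08:92:04:0a:ec:00']" back into a real Python list.
    adapters = ast.literal_eval(data["MAC Addresses"])
    payload = json.dumps({"adapters": adapters, "hostName": data["Endpoint Name"], "role": "Staff"})
    print(payload)  # e.g. {"adapters": ["08:92:04:0a:ec:00"], ...}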

Related

I could not write the data frame including a Twitter hashtag to the CSV file. When I subset some of the variables, the CSV file does not show the subsetted variables

I extracted Twitter data into an R script via the rtweet and tidyverse packages using a specific hashtag. After successfully getting the data, I need to subset the variables I want to analyze. I wrote the subset expression and the console shows the subsetted variables. Despite this, when I try to write the result to a CSV file, the written CSV contains all of the variables instead of only the subsetted ones. The code I entered is as follows.
twitter_data_armenian_issue_iki <- search_tweets("ermeniler", n=1000, include_rts = FALSE)
view(twitter_data_armenian_issue_iki)
twitter_data_armenian_issue_iki_clean <- cbind(twitter_data_armenian_issue_iki, users_data(twitter_data_armenian_issue_iki)[, c("id", "id_str", "name", "screen_name")])
twitter_data_armenian_issue_iki_clean <- twitter_data_armenian_issue_iki_clean[, !duplicated(colnames(twitter_data_armenian_issue_iki_clean))]
view(twitter_data_armenian_issue_iki_clean)
data_bir <- data.frame(twitter_data_armenian_issue_iki_clean)
data_bir[, c("created_at", "id", "id_str", "full_text", "name", "screen_name", "in_reply_to_screen_name")]
write.csv(data_bir, "newdata.csv", row.names = FALSE)
If anyone wants to help me, I will be very pleased. Thank you.
I tried to get the Twitter data with only the specific columns I want to analyze. To do this, I entered the subset expression and ran it, but when I tried to write the result to a CSV, the written CSV file showed all of the variables. I also checked the environment panel and could not see the subsetted data.
My question is: how can I add the subsetted data to the environment and write it to the CSV without any error?

How to include SQLAlchemy Data types in Python dictionary

I've written an application using Python 3.6, pandas and sqlalchemy to automate the bulk loading of data into various back-end databases and the script works well.
As a brief summary, the script reads data from various Excel and CSV source files, one at a time, into a pandas dataframe and then uses the df.to_sql() method to write the data to a database. For maximum flexibility, I use a JSON file to provide all the configuration information, including the names and types of source files, the database engine connection strings, the column titles for the source file and the column titles in the destination table.
When my script runs, it reads the JSON configuration, imports the specified source data into a dataframe, renames source columns to match the destination columns, drops any columns from the dataframe that are not required and then writes the dataframe contents to the database table using a call similar to:
df.to_sql(strTablename, con=engine, if_exists="append", index=False, chunksize=5000, schema="dbo")
The problem I have is that I would also like to specify the column data types in the df.to_sql method and provide them as inputs from the JSON configuration file. However, this doesn't appear to be possible, as all the strings in the JSON file need to be enclosed in quotes and they then don't translate into types when read by my code. This is how the df.to_sql call should look:
df.to_sql(strTablename, con=engine, if_exists="append", dtype=dictDatatypes, index=False, chunksize=5000, schema="dbo")
The entries that form the dtype dictionary from my JSON file look like this:
"Data Types": {
"EmployeeNumber": "sqlalchemy.types.NVARCHAR(length=255)",
"Services": "sqlalchemy.types.INT()",
"UploadActivities": "sqlalchemy.types.INT()",
......
and there are many more, one for each column.
However, when the above is read in as a dictionary, which I pass to the df.to_sql method, it doesn't work, as the SQLAlchemy data types shouldn't be enclosed in quotes, but I can't get around this in my JSON file. The dictionary values therefore aren't recognised by pandas. They look like this:
{'EmployeeNumber': 'sqlalchemy.types.INT()', ....}
And they really need to look like this:
{'EmployeeNumber': sqlalchemy.types.INT(), ....}
Does anyone have experience of this to suggest how I might be able to have the sqlalchemy datatypes in my configuration file?
You could use eval() to convert the string names to objects of that type:
import sqlalchemy as sa
from pprint import pprint

dict_datatypes = {"EmployeeNumber": "sa.INT", "EmployeeName": "sa.String(50)"}
pprint(dict_datatypes)
"""console output:
{'EmployeeName': 'sa.String(50)', 'EmployeeNumber': 'sa.INT'}
"""
for key in dict_datatypes:
    dict_datatypes[key] = eval(dict_datatypes[key])
pprint(dict_datatypes)
"""console output:
{'EmployeeName': String(length=50),
 'EmployeeNumber': <class 'sqlalchemy.sql.sqltypes.INTEGER'>}
"""
Just be sure that you do not pass untrusted input values to functions like eval() and exec().
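If eval() is a concern, a possible alternative (a sketch only, with made-up type names for the JSON side) is an explicit whitelist that maps the strings allowed in the configuration file to SQLAlchemy type objects:

import sqlalchemy as sa

# Hypothetical names for the JSON config; only strings listed here are accepted.
TYPE_MAP = {
    "nvarchar255": sa.types.NVARCHAR(length=255),
    "int": sa.types.INT(),
    "float": sa.types.FLOAT(),
}

# The "Data Types" section of the JSON would then read e.g. {"EmployeeNumber": "int", ...}
config_types = {"EmployeeNumber": "int", "Services": "int"}
dict_datatypes = {col: TYPE_MAP[name] for col, name in config_types.items()}
# dict_datatypes now holds real type objects and can be passed as dtype= to df.to_sql().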

Copying CSV data to a JSON array object in Azure Data Factory

I've been going round in circles trying to get what I thought would be a relatively trivial pipeline working in Azure Data Factory. I have a CSV file with a schema like this:
Id, Name, Color
1, Apple, Green
2, Lemon, Yellow
I need to transform the CSV into a JSON file that looks like this:
{"fruits":[{"Id":"1","Name":"Apple","Color":"Green"},{"Id":"2","Name":"Lemon","Color":"Yellow"}]
I can't find a simple example that helps me understand how to do this in ADF. I've tried a Copy activity and a data flow, but the furthest I've got is JSON objects like this:
{"fruits":{"Id":"1","Name":"Apple","Color":"Green"}}
{"fruits":{"Id":"2","Name":"Lemon","Color":"Yellow"}}
Surely this is simple to achieve. I'd be very grateful if anyone has any suggestions. Thanks!
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#tabularhierarchical-source-to-hierarchical-sink
"When copying data from tabular source to hierarchical sink, writing to array inside object is not supported"
But if you set the file pattern under the Sink properties to 'Array of objects', you can get as far as this:
[{"Id":"1","Name":" Apple","Color":" Green"}
,{"Id":"2","Name":" Lemon","Color":" Yellow"}
]
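If the outer "fruits" wrapper is strictly required, one workaround (not something the answer above covers, just a sketch) is to post-process that array-of-objects file with a small script after the copy:

import json

# Hypothetical file names; point these at wherever the sink wrote its output.
with open("fruits_array.json") as src:
    fruits = json.load(src)             # the [{...}, {...}] array produced by the sink

with open("fruits_wrapped.json", "w") as dst:
    json.dump({"fruits": fruits}, dst)  # {"fruits": [{...}, {...}]}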

Can AWS Lambda write CSV to response?

Like the title says, I would like to know if it is possible to return the response of a Lambda function in CSV format. I already know that it is possible to return JSON objects this way, but for my current project CSV format is necessary. I have only seen discussion of writing CSV files to S3, which is not what we need for this project.
This is an example of what I would like to have displayed in a response:
year,month,day,hour
2017,10,11,00
2017,10,11,01
2017,10,11,02
2017,10,11,03
2017,10,11,04
2017,10,11,05
2017,10,11,06
2017,10,11,07
2017,10,11,08
2017,10,11,09
Thanks!
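A minimal sketch of how this could look in Python, assuming the function sits behind API Gateway with Lambda proxy integration, where the body of the proxy response is just a string and can therefore carry CSV:

import csv
import io

def lambda_handler(event, context):
    rows = [
        ("year", "month", "day", "hour"),
        ("2017", "10", "11", "00"),
        ("2017", "10", "11", "01"),
    ]
    # Build the CSV in memory rather than writing it to S3.
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)

    # With Lambda proxy integration, API Gateway returns this body verbatim,
    # so the caller sees plain CSV instead of JSON.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/csv"},
        "body": buf.getvalue(),
    }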

Parse a huge JSON file

I have a very large JSON file (about a gigabyte) which I want to parse.
I tried JsonSlurper, but it looks like it tries to load the whole file into memory, which causes an out-of-memory exception.
Here is a piece of code I have:
import groovy.json.JsonSlurper
import groovy.json.JsonParserType

def parser = new JsonSlurper().setType(JsonParserType.CHARACTER_SOURCE)
def result = parser.parse(new File("equity_listing_full_201604160411.json"))
result.each {
    println it.Listing.ID
}
And the JSON looks something like this, but much longer, with more columns and rows:
[
{"Listing": {"ID":"2013056","VERSION":"20160229:053120:000","NAME":"XXXXXX","C_ID":["1927445"],}},
{"Listing": {"ID":"2013057","VERSION":"20160229:053120:000","NAME":"XXXXXX","C_ID":["1927446"],}},
{"Listing": {"ID":"2013058","VERSION":"20160229:053120:000","NAME":"XXXXXX","C_ID":["1927447"],}}
]
I want to be able to read it row by row. I can probably just parse each row separately, but I was thinking there might be something that parses as you read.
I suggest using GSON by Google.
There is a streaming parsing option here: https://sites.google.com/site/gson/streaming
