Read CSV file titles with blanks inside - Linux

I'm trying to read this CSV file:
"EDP";"Picture 1";"Picture 2";"Picture 2"
"1001480210";"T244.png";"";""
I created a script to read this file:
cd ~/86829/
while IFS=';' read "EDP" "Picture 1" "Picture 2" "Picture 3"
When I run the script, I get this error:
./RUN_CopyPictures.sh: 9: read: Picture 1: bad variable name
When I change the titles to "Picture1";"Picture2";"Picture2" and the script to while IFS=';' read EDP Picture1 Picture2 Picture3
it works.
So my question is:
How can I read CSV file titles that contain blanks?

The problem is not about reading "titles" (which is usually just the first line of the CSV file).
Instead, your problem (that causes the error) is that a line read x assigns a value to the variable named x (which can then be referenced via $x).
In your example you use read "Picture 1" which effectively tries to assign a value to the variable named Picture 1.
Since bash forbids variable names with spaces (just like probably any other non-esoteric programming language), this gives you an error.
The solution is to use valid variable names (no spaces); the names themselves are up to you:
while IFS=';' read -r edp pic1 fusel y
do
    echo "Picture 1 is ${pic1}"
    echo "Picture 2 is ${fusel}"
    echo "Picture 3 is ${y}"
done
There are a number of naming-schemes for variable names, common ones include all lowercase with (or without) underscores to separate words, or CamelCase.
on "titles"
Your script doesn't know anything about what you refer to as "titles".
CSV doesn't know anything about "titles" either.
CSV is simply a format that has both lines (rows) and columns, as opposed to simpler text files that only have lines.
And just as a text file doesn't have a notion of a "heading", a CSV file has no "titles".
Popular CSV-exporting software might, however, assign a special meaning to the very first row of a CSV file and (ab)use it for title information (so the content is only a label, whereas the actual column content can be something different).
In any case, the IFS=';' read ... part of your script doesn't do anything with titles; it simply extracts multiple fields from a single line of input data and assigns these fields to variables.
The name of these variables can be totally arbitrary (as long as they conform to the bash syntax for variable names), and need not have anything to do with the content of the file.


Copy CSV File with Multiline Attribute with Azure Synapse Pipeline

I have a CSV file in the following format which I want to copy from an external share to my data lake:
Test; Text
"1"; "This is a text
which goes on on a second line
and on on a third line"
"2"; "Another Test"
I now want to load it with a Copy Data task in an Azure Synapse pipeline. The result is the following:
Test; Text
"1";" \"This is a text"
"which goes on on a second line";
"and on on a third line\"";
"2";" \"Another Test\""
So, as you can see, it does not handle the multi-line text correctly. I also do not see an option to handle multi-line text within a Copy Data task. Unfortunately I'm not able to use a Data Flow task, because it cannot run with an external Azure runtime, which I'm forced to use for security reasons.
Of course, I'm not talking about just this single test file; I have thousands of files.
My settings for the CSV file look as follows:
CSV Connector Settings
Can someone tell me how to handle this kind of multiline data correctly?
Do I have any other options within Synapse (apart from the Dataflows)?
Thanks a lot for your help
Well, it turns out this is not possible with a CSV file.
The pragmatic solution is to transfer the CSV files as "binary" files instead, and only load and transform them later with a Python notebook in Synapse.
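As a rough illustration of that notebook step (a sketch, not the original poster's actual code): Python's standard csv module can parse quoted fields that span several lines. The snippet assumes the ';' delimiter and the optional space after it seen in the sample above. The SAMPLE text and the read_rows name are made up for illustration; in a real notebook the stream would come from open(path, newline='') on wherever the files were copied to.

```python
import csv
import io

# Sample data mirroring the file shown in the question (illustrative only).
SAMPLE = (
    'Test; Text\n'
    '"1"; "This is a text\n'
    'which goes on on a second line\n'
    'and on on a third line"\n'
    '"2"; "Another Test"\n'
)


def read_rows(text_stream):
    """Parse semicolon-separated CSV, keeping line breaks inside quoted fields."""
    reader = csv.reader(
        text_stream,
        delimiter=';',
        quotechar='"',
        skipinitialspace=True,  # ignore the space after each ';' so quotes are recognised
    )
    return list(reader)


if __name__ == "__main__":
    # newline='' leaves line endings untouched, as the csv module expects.
    for row in read_rows(io.StringIO(SAMPLE, newline='')):
        print(row)
```

The important detail is skipinitialspace=True: without it, the space before the opening quote makes the parser treat the quote as ordinary data, and the multi-line field is split again.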
You can achieve this in Azure Data Factory by iterating through all lines and checking each line for the delimiter, then using string manipulation functions with Set Variable activities to convert the multi-line data to a single line.
Look at the following example. I have a Set Variable activity that assigns an empty value (taken from a parameter) to the req variable.
In the Lookup activity, create a dataset with the following configuration for the multi-line CSV:
In the ForEach activity, I iterate over each row by setting the items value to @range(0,sub(activity('Lookup1').output.count,1)). Inside the ForEach, I have an If Condition activity with the following condition:
@contains(activity('Lookup1').output.value[item()]['Prop_0'],';')
If this is true, then I concat the current result to req variable using 2 set variable activities.
temp: @if(contains(activity('Lookup1').output.value[add(item(),1)]['Prop_0'],';'),concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],decodeUriComponent('%0D%0A')),concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],' '))
actual (req variable): @variables('val')
For false, I have handled the concatenation in the following way:
temp1: @concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],' ')
actual1 (req variable): @variables('val2')
Now, I have used a final variable to handle last line of the file. I have used the following dynamic content for that:
@if(contains(last(activity('Lookup1').output.value)['Prop_0'],';'),concat(variables('req'),decodeUriComponent('%0D%0A'),last(activity('Lookup1').output.value)['Prop_0']),concat(variables('req'),last(activity('Lookup1').output.value)['Prop_0']))
Finally, I use a Copy Data activity with a sample source file that has 1 column and 1 row (it only serves as a vehicle for copying our actual data).
Now, configure the source file as shown below:
Create an additional column whose value is the final variable:
Create a sink with the following configuration and map only the column created above:
When I run the pipeline, I get the data as required. The following is an output image for reference.

How do I replace source and target attribute names in a command using Python? I have over 500 different attributes to replace

The variables var1 and var2 in the command below need to be replaced, and the command replicated, for over 500 unique attributes. I'm a complete beginner in Python; any suggestions?
(The plan is to read a file with 500 attributes and loop over it, replicating the command for each attribute found in the file, and write the result to a second file or print it to the console.)
command:
dsconfig create-proxy-transformation --transformation-name Attr-Mapping_proxy01 --type attribute-mapping --set enabled:true --set source-attribute:"var1" --set target-attribute:"var2"
# Print the first two whitespace-separated fields of every line
with open('open.txt') as file:
    for line in file:
        fields = line.strip().split()
        print(fields[0], fields[1])
Inside the for loop, assign the appropriate field values to var1 and var2 in the command above; the loop does the rest.
This assumes a text file such as:
AA11 BB11 CC11 DD11
AA22 BB22 CC22 DD22
AA33 BB44 CC44 DD33
Python is so easy :)
To be more specific, split() splits a string into fields separated by a delimiter (by default any run of whitespace, e.g. spaces or tabs) and returns a list of the fields. strip() removes all whitespace from the beginning and end of a line. A file in Python is an iterable object which, when iterated over with for ... in, yields the lines of the file one by one. For more information on these, see http://docs.python.org/2/library/stdtypes.html#str.split , http://docs.python.org/2/library/stdtypes.html#str.strip , http://docs.python.org/2/library/stdtypes.html#bltin-file-objects .
link : Reading a file and storing values into variables in python
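Putting the pieces together, here is a minimal sketch of how the command could be replicated for each line. It assumes the first field on a line is the source attribute and the second is the target, as in the sample file above. The names COMMAND_TEMPLATE and build_commands are invented for this example, and the command text is copied unchanged from the question (including the fixed transformation name, which you may need to vary yourself).

```python
# Command copied from the question; only the two attribute values are substituted.
COMMAND_TEMPLATE = (
    'dsconfig create-proxy-transformation '
    '--transformation-name Attr-Mapping_proxy01 '
    '--type attribute-mapping --set enabled:true '
    '--set source-attribute:"{source}" --set target-attribute:"{target}"'
)


def build_commands(lines):
    """Return one command string per line that has at least two fields."""
    commands = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        commands.append(
            COMMAND_TEMPLATE.format(source=fields[0], target=fields[1])
        )
    return commands


if __name__ == "__main__":
    # Stand-in for the lines of the attribute file.
    sample_lines = ["AA11 BB11 CC11 DD11", "AA22 BB22 CC22 DD22"]
    for command in build_commands(sample_lines):
        print(command)
```

To use a real file, pass the open file object instead of the list, e.g. build_commands(open('open.txt')), and write the returned strings to a second file if you prefer that over printing.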

Issue setting up a save path with integer variables and strings in kdb+

I am basically trying to save to data/${EPOCH_TIME}:
begin_unix_time: "J"$first system "date +%s"
\t:1 save `data/"string"$"begin_unix_time"
I am expecting it to save to data/1578377178
You do not need to cast first system "date +%s" to a long in this case, since you want to attach one string to another. Instead you can use
begin_unix_time:first system "date +%s"
to store the string of numbers:
q)begin_unix_time
"1578377547"
q)`$"data/",begin_unix_time
`data/1578377547
Here you use the comma , to join one string to another, then use the cast `$ to convert the string to a symbol.
The keyword save saves global data to a file. Given your filepath, it looks like you're trying to save a global variable named 1578377547, and kdb cannot handle variable names that consist purely of digits.
You might want to try saving a variable named a1578377547 instead, for example. This would change the above line to
q)`$"data/a",begin_unix_time
`data/a1578377547
and your save would work correctly, given that the global variable a1578377547 exists. Because you take the date, down to the second, directly from Linux in the same line that saves the variable, this will likely not work: the time is constantly changing!
Also note that the timer system command repeats the execution n times (as in \t:n), meaning that the same variable will be saved multiple times as long as the second does not change. For large n the time will likely change, and should the second change you won't have anything assigned to the global variable you are trying to save.

Batch Script - Comma causing problems when writing numerical value to txt file

So this is an example of a script I'm writing to produce txt file containing a list of correctly formatted commands to be passed to another system, based on a long series of questions.
If objectName, objectNumber, or objectCategory are assigned an alphanumeric value, the script will write correctly to the txt file. However, if I were to assign a numerical value, the value is not written to the txt file.
I'm guessing this is a syntax issue related to the use of a comma, as I can replace it with pretty much anything and the script will behave, though unfortunately it has to be a comma.
@ECHO OFF
SET objectCounter=1
SET /p objectName=What is the name of the object %objectCounter%?:
@ECHO OBJECTNAME%objectCounter%,%objectName%> objects.txt
SET /p objectNumber=How many of object %objectCounter% are there?:
@ECHO OBJECTNUMBER%objectCounter%,%objectNumber%>> objects.txt
SET /p objectCategory=Which group does object %objectCounter% belong to?:
@ECHO OBJECTCATEGORY%objectCounter%,%objectNumber%>> objects.txt
This is an example of the output to the txt file if objectNumber is assigned a value of 1:
OBJECTNAME1,Apple
OBJECTNUMBER1,
OBJECTCATEGORY1,Fruit
And this is an example of the output to the txt file if objectNumber is assigned a value of 2 or more:
OBJECTNAME1,Apple
OBJECTCATEGORY1,Fruit
This is a simplification: an echo command to send the text data 1 to a file:
echo data 1>somewhere.txt
Here it is easy to see that the 1 will be handled by the parser as the stream number to redirect, not data to send to the file.
But the question is not using a space, but a comma. Why the same behaviour? Because from the parser's point of view, spaces, tabs, commas, semicolons, parentheses and equals signs are all delimiters. All these lines fail the same way (tabs omitted):
echo data 1>>somewhere.txt
echo data,1>>somewhere.txt
echo data;1>>somewhere.txt
echo data=1>>somewhere.txt
echo data(1>>somewhere.txt
echo data)1>>somewhere.txt
How to handle it? It is necessary to separate the digit from the redirection, so we can change the order in the line
>somewhere.txt echo data,1
or force the parser to see the separation
(echo data,1)>somewhere.txt
or, if the data is inside a variable, we can also use delayed expansion
set "n=1"
setlocal enabledelayedexpansion
echo data,!n!>somewhere.txt
Of course, we can also do
echo data,1 >>somewhere.txt
including a space between the data and the redirection, but the space will be included in the redirected data.
Another option is to reorganize the code
@ECHO OFF
SET objectCounter=1
SET /p "objectName=What is the name of the object %objectCounter%?: "
SET /p "objectNumber=How many of object %objectCounter% are there?: "
SET /p "objectCategory=Which group does object %objectCounter% belong to?: "
> objects.txt (
ECHO OBJECTNAME%objectCounter%,%objectName%
ECHO OBJECTNUMBER%objectCounter%,%objectNumber%
ECHO OBJECTCATEGORY%objectCounter%,%objectCategory%
)
It looks like it has to do with output redirection in conjunction with the comma. I think, with the comma in there, the numeric value is being bound to the redirection rather than to the thing being output.
In other words, while:
set x=1
echo xyzzy%x%>qq.txt
will work (the thing being output is xyzzy%x% with a redirection of >qq.txt), the following will not:
set x=1
echo xyzzy,%x%>qq.txt
(presumably because the thing being output is xyzzy, with a redirection operation 1>qq.txt which is the same as >qq.txt). That also explains the difference you're seeing between 1 and other numbers since 1 is standard output.
You can see a similar problem even without variable expansion:
C:\pax> echo xyzzy1>qq.txt
C:\pax> type qq.txt
xyzzy1
C:\pax> echo xyzzy,1>qq.txt
C:\pax> type qq.txt
xyzzy,
One way around it is to reorganise your components so that the numeric value cannot be tied to the redirection:
>>objects.txt ECHO OBJECTNUMBER%objectCounter%,%objectNumber%
I tend to prefer putting them at the start since using something like:
echo xyzzy >file
will actually output xyzzy and the space immediately before the >.
Modifying the lines like that (and fixing your third echo so it outputs the category rather than the number again) gives you:
What is the name of the object 1?: Apple
How many of object 1 are there?: 1
Which group does object 1 belong to?: Fruit
with the resultant file being:
OBJECTNAME1,Apple
OBJECTNUMBER1,1
OBJECTCATEGORY1,Fruit

Copy value of 1 variable into another variable in UNIX

I'm new to Unix and not familiar with the syntax, so please excuse any mistakes. I am trying to copy the value of one variable and store it in another variable, e.g.:
Two variables:
abc
bcd
Given:
abc=123
I want to copy the contents of abc, i.e. 123, into bcd. How can I achieve this in Unix?
Earlier I was copying the contents of abc into a .txt file, which worked for me; see the code snippet below:
abc='123'
echo $abc >>/data/test/tt.txt
But now I want to copy them into another variable, so I tried the following, without success.
abc='123'
test=`echo $abc>>bcd`
echo $test
Can you assist me in this?
Easy:
bcd="$abc"
For example:
abc="hello world"
The quotes there are necessary, or else the shell will try to run a command named world with abc=hello in its environment.
Actually, when copying one variable into another the quotes are not necessary (thanks to 1_CR for pointing this out), but I like to add them for readability:
bcd=$abc
bcd="$abc"
They both do the same, exactly what you need.
Lastly, do not use single quotes, or else you will not get the value of the variable:
bcd='$abc'
Error! Now your bcd variable contains the literal value $abc.
