PostGIS merge and order of linestrings - KML

I've imported a KML file into my PostGIS database. When I select a road I get the right result:
This is one road. There aren't a lot of rows, so I could order them manually, but some roads have more than 100.
So I would like to order the linestrings at import time.
Here is my plan:
1 - I would like to merge the linestrings. This is possible with ST_Union, but if I do it now the result is very strange; that's why I have to order the lines first.
2 - So I have to order the linestrings; that's why I have a position column in my table. I know how to get the first and the last point of a linestring.
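For instance, the first and last vertices of a linestring can be read with PostGIS's ST_StartPoint and ST_EndPoint (a minimal sketch against the same sections table):
SELECT ST_AsText(ST_StartPoint(geometrie)) AS start_point,
       ST_AsText(ST_EndPoint(geometrie)) AS end_point
FROM sections
WHERE nom_voie = 'LA THERMALE';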
When I do this:
SELECT ST_AsText(ST_ClosestPoint(ST_GeomFromText('POINT(7.38770714271048 47.5497446465569)',4326),geometrie)),
ST_AsText(ST_ClosestPoint(geometrie,ST_GeomFromText('POINT(7.38770714271048 47.5497446465569)',4326)))
FROM sections
WHERE nom_voie = 'LA THERMALE';
Here 7.38770714271048 47.5497446465569 is the endpoint.
It returns all rows of the road LA THERMALE.
Is there another solution to merge the linestrings, maybe without ordering?
When I concatenate the linestrings the result is wrong: it connects the endpoint of Line 1 to the start point of Line 4, and so on. I think it's because they aren't ordered.

Try using ST_Collect to aggregate the line pieces into a MULTILINESTRING (hopefully), then use ST_LineMerge to sew them together.
SELECT nom_voie, ST_LineMerge(ST_Collect(geometrie))
FROM sections
WHERE nom_voie = 'LA THERMALE'
GROUP BY nom_voie;
For example, with a MULTILINESTRING, same as your figure:
SELECT ST_AsText(ST_LineMerge('
MULTILINESTRING ((27 215, 140 170),
(230 210, 330 170),
(230 210, 140 170),
(330 170, 380 230))'));
st_astext
----------------------------------------------------
LINESTRING(27 215,140 170,230 210,330 170,380 230)
(1 row)
So from this, it doesn't appear ordering or even direction matters.
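If you want to keep the merged result, one option is to materialise it (a sketch; the merged_roads table name is an assumption):
CREATE TABLE merged_roads AS
SELECT nom_voie, ST_LineMerge(ST_Collect(geometrie)) AS geometrie
FROM sections
GROUP BY nom_voie;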


R DiagrammeR package Mermaid text using actual calculation results

I would like to use the DiagrammeR package for a simple flow chart in my R Markdown document. However, I couldn't figure out a way to insert actual output from a data table into the node text. Suppose I have a simple query of a database with total records, patient counts, and year info for three different cohorts.
I wanted to create a diagram using Mermaid. The code looks like this:
Total = paste0('Records:',b1$records,' Patients:',b1$patients,' Year:',b1$year)
# (Records:1000 Patients:822 Year:5)
Sub1 = paste0('Records:',b2$records,' Patients:',b2$patients,' Year:',b2$year)
Sub2 = paste0('Records:',b3$records,' Patients:',b3$patients,' Year:',b3$year)
mermaid("
graph TB
A[Total] --> B{Sub1} --> C{Sub2}
")
Instead of printing out the diagram with "Records:1000 Patients:822 Year:5" in node A, it shows the verbatim word "Total".
Any suggestion on how to do it correctly?
Thanks!
You are one step away from what you'd like to achieve. Please try this simple example below to see the logic:
library(DiagrammeR)
Structure:
DiagrammeR(
"
graph TB
A[Question] -->B[Answer]
"
)
1. Define answer node:
B <- paste0("There are ", nrow(iris), " records")
2. Combine it with other components, using ; to separate statements:
results <- paste0("graph TB; A[How many rows does iris have?]-->", "B[", B, "]")
3. Call 'results' in DiagrammeR:
DiagrammeR(diagram = results)
The final plot should refresh when your calculation updates. (Figure: the plot that calls your calculation.)
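Applying the same logic back to the original question might look like this (a sketch; b1, b2 and b3 are assumed to be one-row data frames from the poster's queries):
library(DiagrammeR)
Total <- paste0("Records:", b1$records, " Patients:", b1$patients, " Year:", b1$year)
Sub1 <- paste0("Records:", b2$records, " Patients:", b2$patients, " Year:", b2$year)
Sub2 <- paste0("Records:", b3$records, " Patients:", b3$patients, " Year:", b3$year)
mermaid(paste0("graph TB; A[", Total, "] --> B{", Sub1, "} --> C{", Sub2, "}"))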

Is there a way to randomly select multiple values from an array in Python?

For example, in making a text-based game, I'm trying to select a few countries out of a whole array (only a few, not the whole thing), but it needs to be random and different every time. I'll try to display it in pseudocode:
Import random
nations = [UK, USA, France, Spain, Germany, Russia, Sweden, Norway, Austria, Turkey, KSA, UAE,
India, PRC, Japan, Mongolia, Kyrgyzstan, Egypt, Algeria, Morocco, Nigeria, Ghana, Laos,
Vietnam, Cambodia, Congo, Kenya, Somali, Sudan]
nationsForThisGame = nations.random(9)
// This gives me 9 random nations from that array
output(F"You are allies with {nationsForThisGame(1)}")
// This means that from the second array called nationsForThisGame, the first nation you're
allies with
Now how do I make this into python? I tried this similar structure, but it says: 'list' object has no attribute 'random'
So from the first array, I just want a few random values to put into my second array. How do I do that?
Use the following:
import random

Nations = ["Nation1", "Nation2", "Nation3", "Nation4", "Nation5", "Nation6",
           "Nation7", "Nation8", "Nation9"]
print(random.sample(Nations, 6))
It works well.
This means that from the array 'Nations', you can randomly choose as many values as you want (random.sample picks without repeats). This helped me in a text-based game I'm making.
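Translated back to the poster's pseudocode, it might look like this (a sketch; note that Python lists are indexed from 0):
import random

nations = ["UK", "USA", "France", "Spain", "Germany", "Russia", "Sweden",
           "Norway", "Austria", "Turkey", "KSA", "UAE", "India", "PRC"]
nationsForThisGame = random.sample(nations, 9)  # 9 distinct random nations
print(f"You are allies with {nationsForThisGame[0]}")  # the first ally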

ADLA Job: Write To Different Files Based On Line Content

I have a BUNCH of fixed-width text files that contain multiple transaction types, with only 3 that I care about (121, 122, 124).
Sample File:
D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798
D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00
D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200
So what I need to do is read these files line by line and look for the lines that have 121, 122, or 124 at startIndex = 9 with length = 3.
Each line needs to be parsed based on a data dictionary I have, and the output needs to be grouped by transaction type into three different files.
I have a process that works, but it's very inefficient, basically reading each line 3 times. The code I have is something like this:
@t121 =
    EXTRACT col1 string,
            col2 string,
            col3 string // etc...
    FROM @inputFile
    USING new MyCustomExtractor(
        new SQL.MAP<string, string> {
            {"col1", "2"},
            {"col2", "6"},
            {"col3", "3"} // etc...
        }
    );

OUTPUT @t121
TO "/output/121.csv"
USING Outputters.Csv();
And I have the same code for 122 and 124. My custom extractor takes the SQL.MAP and returns the parsed line, skipping all lines that don't contain the transaction type I'm looking for.
This approach also means I'm running through all the lines in a file 3 times. Obviously this isn't as efficient as it could be.
What I'm looking for is a high-level concept of the most efficient way to read a line, determine whether it is a transaction I care about, then output it to the correct file.
Thanks in advance.
How about pulling out the transaction type early using the Substring method of the String datatype? Then you can do some work with it, filtering etc. A simple example:
// Test data
@input =
    SELECT *
    FROM (
        VALUES
        ( "D103421612100188300000300000000012N000002000001000032021420170012260214201700122600000000059500000300001025798" ),
        ( "D103421612200188300000300000000011000000000010000012053700028200004017000000010240000010000011NNYNY000001000003N0000000000 00" ),
        ( "D1034216124001883000003000000000110000000000300000100000000000CS00000100000001200000033NN0 00000001200" ),
        ( "D1034216999 0000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000" )
    ) AS x ( rawData );

// Pull out the transaction type
@working =
    SELECT rawData.Substring(8, 3) AS transactionType,
           rawData
    FROM @input;

// !!TODO do some other work here

@output =
    SELECT *
    FROM @working
    WHERE transactionType IN ("121", "122", "124"); // NB: note the case-sensitive IN clause

OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();
As of today, there is no specific U-SQL function that can define the output location of a tuple on the fly.
wBob presented an approach to a potential workaround. I'd extend the solution in the following way to address your need (see the sketch after this list):
1. Read the entire file, adding a new column that helps you identify the transaction type.
2. Create 3 rowsets (one for each file) using a WHERE clause with the specific transaction type (121, 122, 124) on the column created in the previous step.
3. Output each rowset created in the previous step to its individual file.
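A minimal sketch of those three steps, building on the @working rowset above (output paths and rowset names are my assumptions):
// One rowset per transaction type
@t121 = SELECT rawData FROM @working WHERE transactionType == "121";
@t122 = SELECT rawData FROM @working WHERE transactionType == "122";
@t124 = SELECT rawData FROM @working WHERE transactionType == "124";

// One read of the input, three output files
OUTPUT @t121 TO "/output/121.csv" USING Outputters.Csv();
OUTPUT @t122 TO "/output/122.csv" USING Outputters.Csv();
OUTPUT @t124 TO "/output/124.csv" USING Outputters.Csv();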
If you have more feedback or needs, feel free to create an item (and vote for others) on our UserVoice site: https://feedback.azure.com/forums/327234-data-lake. Thanks!

Reordering data by manipulating it column-wise in Python

I have data in a csv file as follows:
60,27702,1938470,13935,18513,8
60,32424,1933740,16103,15082,11
60,20080,1946092,9335,14970,2
60,28236,1937936,13799,16871,6
60,22717,1943455,10809,16726,4
120,37702,2938470,23935,28513,8
120,42424,2933740,26103,25082,11
120,30080,2946092,2335,24970,2
120,38236,2937936,23799,26871,6
120,32717,2943455,20809,26726,4
180,47702,3938470,33935,8513,8
180,52424,3933740,36103,5082,11
180,40080,3946092,3335,4970,2
180,48236,3937936,33799,6871,6
180,42717,3943455,30809,6726,4
I then used the following code to insert column headings:
import pandas as pd

df = pd.read_csv("contikiMAC_new_out.csv", names=['Energest', 'CPU', 'LPM', 'Transmit', 'Listen', 'ID'])
I used df.groupby(['ID']) to see the data grouped according to the 'ID' column.
The problem is that the data in the 'LPM' column gets reset after some time, so I would like to add the previous value to the new value whenever the new value in the 'LPM' column is smaller, for a specific 'ID'.
I tried doing:
for x in df.groupby(['ID']):
    for i in df.ID:
        if df.loc[i, 'LPM'] < df.loc[i - 1, 'LPM']:
            df.loc[i, 'LPM'] = df.loc[i, 'LPM'] + df.loc[i - 1, 'LPM']
But I'm not getting the result I want, because it mixes the 'LPM' values of different 'ID's and the process takes a long time. Can anyone please suggest a way to write the data group-wise to a CSV file based on 'ID' after performing the sum operation?
The data structure I would like to see is as follows:
60,27702,1938470,13935,18513,8
120,37702,2938470,23935,28513,8
180,47702,3938470,33935,37026,8
60,32424,1933740,16103,15082,11
120,42424,2933740,26103,25082,11
180,52424,3933740,36103,30164,11
60,20080,1946092,9335,14970,2
120,30080,2946092,2335,24970,2
180,40080,3946092,3335,29940,2
60,28236,1937936,13799,16871,6
120,38236,2937936,23799,26871,6
180,48236,3937936,33799,33742,6
60,22717,1943455,10809,16726,4
120,32717,2943455,20809,26726,4
180,42717,3943455,30809,33452,4
If I understood your problem correctly, DataFrame.shift is what you're looking for.
Something like:
df['LPM_prev'] = df.groupby(['ID'])['LPM'].shift(1)
And then you can work with that column.
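For instance, a minimal sketch of how that shifted column might be used for the cumulative fix (my interpretation of the question, not the answerer's exact code):
import pandas as pd

df['LPM_prev'] = df.groupby(['ID'])['LPM'].shift(1)
# Where the counter reset (current value smaller than the previous one),
# add the previous value back in
mask = df['LPM'] < df['LPM_prev']
df.loc[mask, 'LPM'] = df.loc[mask, 'LPM'] + df.loc[mask, 'LPM_prev']
# Write the rows grouped by 'ID', as in the desired output
df.sort_values('ID').to_csv("contikiMAC_grouped.csv", index=False, header=False)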

Fortran output on two lines instead of one

I'm running a Fortran 90 program that has an array of alpha values with i = 1 to 40. I'm trying to output the array in 5 rows of 8 using the code below:
write(4,*) "alpha "
write(4,*)alpha(1), alpha(2), alpha(3), alpha(4), alpha(5), alpha(6), alpha(7), alpha(8)
write(4,*)alpha(9), alpha(10), alpha(11), alpha(12), alpha(13), alpha(14), alpha(15), alpha(16)
write(4,*)alpha(17), alpha(18), alpha(19), alpha(20), alpha(21), alpha(22), alpha(23), alpha(24)
write(4,*)alpha(25), alpha(26), alpha(27), alpha(28), alpha(29), alpha(30), alpha(31), alpha(32)
write(4,*)alpha(33), alpha(34), alpha(35), alpha(36), alpha(37), alpha(38), alpha(39), alpha(40)
where 4 is the unit number of the desired output file. But when I open the output, there are 10 rows instead of 5, alternating between 5 values and 3 values. Any idea what I can do to avoid this?
Thanks.
Use formatted IO. List-directed IO (i.e., with "*") is designed to be easy but is not fully specified. Different compilers will produce different output. Try something such as:
write (4, '( 8(2X, ES14.6) )' ) alpha (1:8)
Or use a loop:
do i=1, 33, 8
write (4, '( 8(2X, ES14.6) )' ) alpha (i:i+7)
end do
write (4,"(8(1x,f0.4))") alpha
prints the 40 numbers over 5 lines, because of Fortran "format reversion": the format is re-used when you reach the end of it, with further data printed on a new line.
The site http://www.obliquity.com/computer/fortran/format.html says this about format reversion:
"If there are fewer items in the data transfer list than there are data descriptors, then all of the unused descriptors are simply ignored. However, if there are more items in the data transfer list than there are data descriptors, then forced reversion occurs. In this case, FORTRAN 77 advances to the next record and rescans the format, starting with the right-most left parenthesis, including any repeat-count indicators. It then re-uses this part of the format. If there are no inner parenthesis in the FORMAT statement, then the entire format is reused."
