I would like to write a post-processor to open some flow field data in ParaView (using the legacy VTK format). I am fine with the mesh loading, but I have a question about the arrangement of the variables.
I need to put a value at every cell center rather than at the cell nodes: I have one value per cell and no way to provide a value per node. Do you know a way to solve this?
Thank you very much for your kind help
Sure, you can specify cell data in the legacy ASCII VTK file format. Here's a simple example of a rectilinear grid with two cell-data arrays of vector values:
# vtk DataFile Version 2.0
Cell-centered data example
ASCII
DATASET RECTILINEAR_GRID
DIMENSIONS 4 2 2
X_COORDINATES 4 double
0.0 10.0 20.0 30.0
Y_COORDINATES 2 double
0.0 10.0
Z_COORDINATES 2 double
0.0 10.0
CELL_DATA 3
VECTORS first_array double
-1.0 0.0 0.0
0.0 1.0 0.0
1.0 0.0 0.0
VECTORS second_array double
-1.0 0.0 0.0
0.0 1.0 0.0
1.0 0.0 0.0
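For reference, here is a minimal Python sketch that writes a file like the one above (the file name is made up, and only one of the two arrays is written for brevity). The key point is that the CELL_DATA count is the number of cells, (4-1)*(2-1)*(2-1) = 3, not the number of points:

x = [0.0, 10.0, 20.0, 30.0]
y = [0.0, 10.0]
z = [0.0, 10.0]
cell_vectors = [(-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

n_cells = (len(x) - 1) * (len(y) - 1) * (len(z) - 1)  # one value per cell
with open('flow.vtk', 'w') as f:
    f.write('# vtk DataFile Version 2.0\n')
    f.write('Cell-centered data example\n')  # the title line is required
    f.write('ASCII\n')
    f.write('DATASET RECTILINEAR_GRID\n')
    f.write('DIMENSIONS %d %d %d\n' % (len(x), len(y), len(z)))
    f.write('X_COORDINATES %d double\n%s\n' % (len(x), ' '.join('%g' % v for v in x)))
    f.write('Y_COORDINATES %d double\n%s\n' % (len(y), ' '.join('%g' % v for v in y)))
    f.write('Z_COORDINATES %d double\n%s\n' % (len(z), ' '.join('%g' % v for v in z)))
    f.write('CELL_DATA %d\n' % n_cells)  # cells, not points
    f.write('VECTORS first_array double\n')
    for vec in cell_vectors:
        f.write('%g %g %g\n' % vec)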
This is a sample of the dataset I have, produced with the following piece of code:
ComplaintCity = nyc_df.groupby(['City','Complaint Type']).size().sort_values().unstack()
top5CitiesByComplaints = ComplaintCity[top5Complaints].rename_axis(None, axis=1)
top5CitiesByComplaints
Blocked Driveway Illegal Parking Noise - Street/Sidewalk Noise - Commercial Derelict Vehicle
City
ARVERNE 35.0 58.0 29.0 2.0 27.0
ASTORIA 2734.0 1281.0 500.0 1554.0 363.0
BAYSIDE 377.0 514.0 15.0 40.0 198.0
BELLEROSE 95.0 106.0 13.0 37.0 89.0
BREEZY POINT 3.0 15.0 1.0 4.0 3.0
BRONX 12754.0 7859.0 8890.0 2433.0 1952.0
BROOKLYN 28147.0 27461.0 13354.0 11458.0 5179.0
CAMBRIA HEIGHTS 147.0 76.0 25.0 12.0 115.0
CENTRAL PARK NaN 2.0 95.0 NaN NaN
COLLEGE POINT 435.0 352.0 33.0 35.0 184.0
CORONA 2761.0 660.0 238.0 248.0
I want to plot this as a horizontal bar chart for each complaint type, showing the cities with the highest counts of that complaint, similar to the image below. I am not sure how to go about it.
You can create an array of axes instances with subplots and plot the columns one by one:
import matplotlib.pyplot as plt

# One subplot per complaint type (a 3x2 grid leaves one spare axis for 5 columns)
fig, axes = plt.subplots(3, 2, figsize=(10, 6))
for c, ax in zip(df.columns, axes.ravel()):
    df[c].sort_values().plot.barh(ax=ax)
fig.tight_layout()
Then you would get something like this:
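For a fully self-contained version of the same pattern (the data below is made up to stand in for top5CitiesByComplaints), you can also use nlargest so each chart shows only the cities with the highest counts:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cities = ['ARVERNE', 'ASTORIA', 'BAYSIDE', 'BRONX', 'BROOKLYN', 'CORONA']
complaints = ['Blocked Driveway', 'Illegal Parking', 'Noise - Street/Sidewalk',
              'Noise - Commercial', 'Derelict Vehicle']
df = pd.DataFrame(rng.integers(1, 30000, size=(len(cities), len(complaints))),
                  index=cities, columns=complaints)

fig, axes = plt.subplots(3, 2, figsize=(10, 6))
for c, ax in zip(df.columns, axes.ravel()):
    # nlargest keeps the five biggest counts; sort_values orders the bars
    df[c].nlargest(5).sort_values().plot.barh(ax=ax, title=c)
axes.ravel()[-1].set_axis_off()  # five columns, six axes: hide the spare one
fig.tight_layout()
plt.show()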
Simply put, when I subtract or divide with a null value, I want the operation to return the non-null operand, e.g. 3 / np.nan = 3 or 2 - np.nan = 2.
Using np.nansum and np.nanprod I have handled addition and multiplication, but I don't know how to do the same for subtraction and division.
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 2, np.nan, np.nan]})
df
Out[6]:
a b c=a-b d=a/b
0 1 1.0 0.0 1.0
1 2 2.0 0.0 1.0
2 3 NaN 3.0 3.0
3 4 NaN 4.0 4.0
The table above shows what I am actually looking for (columns c and d are the desired results).
# Use a fill value of 0 for the subtraction
df['c'] = df.a.sub(df.b, fill_value=0)
# Use a fill value of 1 for the division
df['d'] = df.a.div(df.b, fill_value=1)
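These fill values work because fill_value substitutes for the missing operand before the operation, and 0 and 1 are the identity elements of subtraction and division respectively. A small runnable sketch:

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 2, np.nan, np.nan]})
df['c'] = df.a.sub(df.b, fill_value=0)  # 3 - NaN becomes 3 - 0 = 3
df['d'] = df.a.div(df.b, fill_value=1)  # 3 / NaN becomes 3 / 1 = 3
print(df)
# Caveat: fill_value applies to either side, so a missing value in 'a'
# would also be replaced by 0 (subtraction) or 1 (division).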
IIUC, use sub with fill_value:
df.a.sub(df.b,fill_value=0)
Out[251]:
0 0.0
1 0.0
2 3.0
3 4.0
dtype: float64
Here's my original dataframe with NaN values, which I'm trying to fill:
https://prnt.sc/i40j33
If I use df.interpolate(axis=1) to fill the NaN values, only some of the rows fill up properly with a number. For example:
https://prnt.sc/i40mgq
As you can see in the screenshot, the cell at column 1981, row 3, which had a NaN, has been filled properly with a value. How do I fill the rest of the NaNs in the same way?
Using DataFrame.interpolate()
In your case it fails because there are no columns to the left, so the interpolate method has nothing to interpolate from (for a single gap, missing_value = (left_value + right_value) / 2).
So you could, for example, insert a column to the left with all 0's (if you would like to impute your missing values on the first column with half of the next value), as such:
df.insert(loc=0, column='allZeroes', value=0)
After this, you could interpolate as you are doing and then remove the helper column:
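A minimal sketch of that workaround (the data here is made up):

import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2.0, 4.0],
                   [np.nan, 10.0, 20.0]], columns=[1981, 1991, 2001])
df.insert(loc=0, column='allZeroes', value=0)  # pad the left edge with zeros
df = df.interpolate(axis=1)                    # leading NaNs now have a left neighbour
df = df.drop(columns='allZeroes')              # remove the helper column
print(df)  # 1981 is imputed as (0 + 2)/2 = 1.0 and (0 + 10)/2 = 5.0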
General missing value imputation
Either use df.fillna('DEFAULT-VALUE') as Alex mentioned in the comments to the question. Docs here
or do something like:
df.loc[df.my_col.isnull(), 'my_col'] = 'DEFAULT-VALUE'
I'd recommend fillna, since you can use methods such as forward fill (method='ffill'), which imputes each missing value with the previous one, among other similar methods.
It seems like you might want to interpolate on axis=0, column-wise:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.arange(35, dtype=float).reshape(5, 7),
...                   columns=[1951, 1961, 1971, 1981, 1991, 2001, 2011],
...                   index=range(0, 5))
>>> df.iloc[1:3, 0] = np.nan
>>> df.iloc[3, 3] = np.nan
>>> df.interpolate(axis=0)
1951 1961 1971 1981 1991 2001 2011
0 0.0 1.0 2.0 3.0 4.0 5.0 6.0
1 7.0 8.0 9.0 10.0 11.0 12.0 13.0
2 14.0 15.0 16.0 17.0 18.0 19.0 20.0
3 21.0 22.0 23.0 24.0 25.0 26.0 27.0
4 28.0 29.0 30.0 31.0 32.0 33.0 34.0
Currently you're interpolating row-wise. NaNs that "begin" a Series aren't padded by a value on either side, making interpolation impossible for them.
Update: pandas added more options for this (e.g. a limit_area argument to interpolate()) in v0.23.0.
My code produces 1000 snapshot_XXXX.dat files (XXXX = 0001, 0002, ...). They are two-column data files, each taking a snapshot of the system I am running at a specific time. I would like to stack them in the order they were created to build a 2D plot (or heatmap) showing the evolution of the quantity I am following over time.
How can I do this using gnuplot?
Assuming you want the time axis going from bottom to top, you could try the following:
n=4 # Number of snapshots
set palette defined (0 "white", 1 "red")
unset key
set style fill solid
set ylabel "Snapshot/Time"
set yrange [0.5:n+0.5]
set ytics 1
# This function gives the name of the i-th snapshot file
snapshot(i) = sprintf("snapshot_%04d.dat", i)
# Plot all snapshot files.
# - "with boxes" fakes the heat map
# - "linecolor palette" takes the third column in the "using"
# instruction which is the second column in the datafiles
# Plot from top to bottom because each boxplot overlays the previous ones.
plot for [i=1:n] snapshot(n+1-i) using 1:(n+1.5-i):2 with boxes linecolor palette
This example data
snapshot_0001.dat snapshot_0002.dat snapshot_0003.dat snapshot_0004.dat
1.0 0.0 1.0 0.0 1.0 0.0 1.0 0.0
1.5 0.0 1.5 0.0 1.5 0.0 1.5 0.0
2.0 0.5 2.0 0.7 2.0 0.7 2.0 0.7
2.5 1.0 2.5 1.5 2.5 1.5 2.5 1.5
3.0 0.5 3.0 0.7 3.0 1.1 3.0 1.5
3.5 0.0 3.5 0.0 3.5 0.7 3.5 1.1
4.0 0.0 4.0 0.0 4.0 0.0 4.0 0.7
4.5 0.0 4.5 0.0 4.5 0.0 4.5 0.0
5.0 0.0 5.0 0.0 5.0 0.0 5.0 0.0
results in this image (tested with Gnuplot 5.0):
You can change the order of the plots if you want to go from top to bottom. If you want to go from left to right, maybe this can help (not tested).
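If you want to reproduce this locally, a short Python sketch can generate the four example snapshot files (values copied from the table above, file names matching the snapshot() function in the gnuplot script):

x = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
columns = {
    1: [0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0],
    2: [0.0, 0.0, 0.7, 1.5, 0.7, 0.0, 0.0, 0.0, 0.0],
    3: [0.0, 0.0, 0.7, 1.5, 1.1, 0.7, 0.0, 0.0, 0.0],
    4: [0.0, 0.0, 0.7, 1.5, 1.5, 1.1, 0.7, 0.0, 0.0],
}
for i, values in columns.items():
    with open('snapshot_%04d.dat' % i, 'w') as f:
        for xi, vi in zip(x, values):
            f.write('%.1f %.1f\n' % (xi, vi))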
Simple question: why do OBJ files contain normals when you can just calculate them?
If I'm correct, I'd just have to take the cross product of the vectors point1-point2 and point1-point3, which would save me the time of reading them from the file.
EDIT:
Trying to be more specific, this is a file I've found and want to use:
g cube
v 0.0 0.0 0.0
v 0.0 0.0 1.0
v 0.0 1.0 0.0
v 0.0 1.0 1.0
v 1.0 0.0 0.0
v 1.0 0.0 1.0
v 1.0 1.0 0.0
v 1.0 1.0 1.0
vn 0.0 0.0 1.0
vn 0.0 0.0 -1.0
vn 0.0 1.0 0.0
vn 0.0 -1.0 0.0
vn 1.0 0.0 0.0
vn -1.0 0.0 0.0
f 1//2 7//2 5//2
f 1//2 3//2 7//2
f 1//6 4//6 3//6
f 1//6 2//6 4//6
f 3//3 8//3 7//3
f 3//3 4//3 8//3
f 5//5 7//5 8//5
f 5//5 8//5 6//5
f 1//4 5//4 6//4
f 1//4 6//4 2//4
f 2//1 6//1 8//1
f 2//1 8//1 4//1
EDIT 2:
Because people complained:
http://en.wikipedia.org/wiki/Wavefront_.obj_file
You can calculate normals, but it takes time to compute them. When you have a lot of meshes and have to render at 60 fps (or more), it is more performant to load precomputed normals into the GPU. Also, the cross product of the vectors point1-point2 and point1-point3 only gives the face normal; to get the per-vertex normals required for Gouraud shading, you have to average the face normals at every vertex, so the computation gets deeper still.
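As a rough illustration of that averaging step, here is a minimal NumPy sketch (not tied to any particular OBJ loader; vertices and faces are assumed to be plain arrays):

import numpy as np

# vertices: (n, 3) array of positions; faces: (m, 3) array of 0-based vertex indices
def vertex_normals(vertices, faces):
    normals = np.zeros(vertices.shape, dtype=float)
    for i0, i1, i2 in faces:
        # Face normal from the cross product of two edge vectors;
        # its magnitude is twice the triangle area, so the sum is area-weighted
        n = np.cross(vertices[i1] - vertices[i0], vertices[i2] - vertices[i0])
        normals[i0] += n
        normals[i1] += n
        normals[i2] += n
    # Normalize each accumulated normal to unit length (guarding against zeros)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.where(lengths == 0.0, 1.0, lengths)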