Vega-Lite: Excessive padding when generating a chart with `vl2svg`

I'm using Vega-Lite's vl2svg CLI helper to headlessly generate chart SVGs on a Heroku server. This works great for the most part, but I'm noticing that the generated SVGs have way too much padding on the left side, to the left of the labels, and the amount of "excess padding" grows as the labels get longer.
For example, here's a simple bar chart with 3 bars and 3 labels. In the online Vega-Lite editor, the chart renders correctly as pictured below (i.e. no excessive padding). But when I generate the same chart via vl2svg on my local computer (using the same package versions: Vega 5.21.0, Vega-Lite 5.2.0), the chart has around 100px of excess padding on the left side before the labels begin.
The chart's VL spec:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"category": "Male", "count": 112},
      {"category": "Female", "count": 122},
      {"category": "Other (eg. non-binary)", "count": 31}
    ]
  },
  "encoding": {
    "y": {"field": "category", "sort": null, "title": "", "type": "nominal", "axis": {}},
    "x": {"field": "count", "title": "", "type": "quantitative", "axis": {}}
  },
  "layer": [
    {"mark": "bar"},
    {
      "mark": {"type": "text", "dx": 3, "dy": 0, "xOffset": 0, "yOffset": 0, "align": "left", "baseline": "middle", "limit": 20, "fontSize": 8},
      "encoding": {"text": {"field": "count"}}
    }
  ]
}
Steps I used to generate the SVG headlessly (on macOS 10.15.7, in case the OS matters):
npm install vega-lite@5.2.0
write the above spec to test.json
node_modules/vega-lite/bin/vl2svg test.json > test.svg
The SVG chart rendered by the online editor (I added the red border to highlight the difference in padding):
The SVG chart rendered by vl2svg run locally:
Why does this difference in padding/spacing exist? What can I do to make the headlessly generated SVG render identically to what the online editor generates? Is this a bug, or user error on my end?
NOTE: The excessive spacing mostly disappears when the labels are uniformly short. For example, if I reword that third label to "Other" and then re-render the SVG using vl2svg, there's only ~5px of excess left padding, as shown below:

Related

I have a problem with Kibana pie chart's color palette

I couldn't find how to change the colors in pie charts and treemaps in my Kibana dashboard.
Does anyone know how to change the default colors of a pie chart in Kibana 7.10.2 dashboards, please?
Pie chart colors can be picked in the edit mode of a specific visualization, after clicking on the legend. To show the legend, click the bottom-left icon.
I'm not aware of a default scheme that applies across dashboards. However, instead of using the UI mentioned above, you could export the dashboard to JSON, edit the JSON, and re-import it. The color is stored in the uiStateJSON property like this:
"uiStateJSON":"{\"vis\":{\"colors\":{\"Count\":\"#629E51\",\"passed\":\"#508642\",\"failed\":\"#BF1B00\",\"pending\":\"#E5AC0E\"}}}"
In more complex scenarios, a Vega visualization can be used, where a common scheme can be defined with the scale property:
"encoding": {
"theta": {"field": "result_count", "type": "quantitative"},
"color": {
"field": "result_count",
"type": "nominal",
"scale": { "scheme": "reds" }
}
}
Or define the scale's range property explicitly: "scale": { "range": ["#FF5050", "#50FF50"] }
Or merge the color dimension with the visualized data, as is done here.

Data is getting embedded via a local json file

I'm trying to plot some data that is in a pandas DataFrame, cdfs:
alt.Chart(cdfs).mark_line().encode(
    x=alt.X('latency:Q', scale=alt.Scale(type='log'), axis=alt.Axis(format="", title='Response_time (ms)')),
    y=alt.Y('percentile:Q', axis=alt.Axis(format="", title='Cumulative Fraction')),
    color='write_size:N',
)
The issue is that, when viewing the source of the resulting plot, there is just a URL to a JSON file. That JSON file can't be found, and hence the plot appears to be blank (no data).
{
  "config": {"view": {"continuousWidth": 400, "continuousHeight": 300}},
  "data": {
    "url": "altair-data-78b044f23db74f7d4408fba9f31b9ea9.json",
    "format": {"type": "json"}
  },
  "mark": "line",
  "encoding": {
    "color": {"type": "nominal", "field": "write_size"},
    "x": {
      "type": "quantitative",
      "axis": {"format": "", "title": "Response_time (ms)"},
      "field": "latency",
      "scale": {"type": "log"}
    },
    "y": {
      "type": "quantitative",
      "axis": {"format": "", "title": "Cumulative Fraction"},
      "field": "percentile"
    }
  },
  "$schema": "https://vega.github.io/schema/vega-lite/v4.8.1.json"
}
This code was previously working (displaying the data on the chart); however, I restarted the JupyterLab server it's running on between then and now.
Hence I'm wondering why the data is suddenly being referenced via a URL rather than embedded directly.
At some point in your session, you must have run
alt.data_transformers.enable('json')
If you want to restore the default data transformer, which embeds data directly into the chart, run
alt.data_transformers.enable('default')
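As a minimal sketch (assuming cdfs is the same DataFrame as in the question), switching back to the default transformer and re-rendering embeds the rows inline under "values" instead of referencing an external file:

import altair as alt

# Switch back to the default transformer so data is embedded inline
# (as "values") instead of being written to an altair-data-*.json file.
alt.data_transformers.enable('default')

chart = alt.Chart(cdfs).mark_line().encode(
    x=alt.X('latency:Q', scale=alt.Scale(type='log'),
            axis=alt.Axis(title='Response_time (ms)')),
    y=alt.Y('percentile:Q', axis=alt.Axis(title='Cumulative Fraction')),
    color='write_size:N',
)

print('"values"' in chart.to_json())  # True: the spec now contains the data itself, not a URL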

Creating wordclouds with Altair

How do I create a wordcloud with Altair?
Vega and Vega-Lite provide word cloud functionality, which I have used successfully in the past.
Therefore, if I understand correctly, it should be possible to access it from Altair, and
I would prefer to express the visualizations in Python rather than in embedded JSON.
All the examples for Altair I have seen involve standard chart types like
scatter plots and bar graphs.
I have not seen any involving word clouds, networks, treemaps, etc.
More specifically, how would I express, or at least approximate, the following Vega visualization in Altair?
def wc(pages, width=2**10.5, height=2**9.5):
    return {
        "$schema": "https://vega.github.io/schema/vega/v3.json",
        "name": "wordcloud",
        "width": width,
        "height": height,
        "padding": 0,
        "data": [
            {
                "name": "table",
                "values": [
                    {"text": pg.title, "definition": pg.defn, "count": pg.count}
                    for pg in pages
                ],
            }
        ],
        "scales": [
            {
                "name": "color",
                "type": "ordinal",
                "range": ["#d5a928", "#652c90", "#939597"],
            }
        ],
        "marks": [
            {
                "type": "text",
                "from": {"data": "table"},
                "encode": {
                    "enter": {
                        "text": {"field": "text"},
                        "align": {"value": "center"},
                        "baseline": {"value": "alphabetic"},
                        "fill": {"scale": "color", "field": "text"},
                        "tooltip": {"field": "definition", "type": "nominal", "fontSize": 32},
                    },
                    "update": {
                        "fillOpacity": {"value": 1}
                    },
                },
                "transform": [
                    {
                        "type": "wordcloud",
                        "size": [width, height],
                        "text": {"field": "text"},
                        # "rotate": {"field": "datum.angle"},
                        "font": "Helvetica Neue, Arial",
                        "fontSize": {"field": "datum.count"},
                        # "fontWeight": {"field": "datum.weight"},
                        "fontSizeRange": [2**4, 2**6],
                        "padding": 2**4,
                    }
                ],
            }
        ],
    }
Vega(wc(pages))
Altair's API is built on the Vega-Lite grammar, which includes only a subset of the plot types available in Vega. Word clouds cannot be created in Vega-Lite, so they cannot be created in Altair.
With mad respect to @jakevdp, you can construct a word cloud (or something word-cloud-like) in Altair by recognizing that the elements of a word cloud chart involve:
a dataset of words and their respective quantities
text marks encoded with each word, and optionally size and/or color based on quantity
"randomly" distributing the text marks in 2D space.
One simple option to distribute marks is to add additional 'x' and 'y' columns to the data, each element being a random sample from the range of your chosen x and y domain:
import random
def shuffled_range(n): return random.sample(range(n), k=n)
n = len(words_and_counts) # words_and_counts: a pandas data frame
x = shuffled_range(n)
y = shuffled_range(n)
data = words_and_counts.assign(x=x, y=y)
This isn't perfect as it doesn't explicitly prevent word overlap, but you can play with n and do a few runs of random number generation until you find a layout that's pleasing.
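One small convenience for those re-runs (not part of the original answer, just an assumed helper): seed the random module so that a layout you like can be reproduced later, and try a few seeds:

import random

def layout_with_seed(words_and_counts, seed):
    # Hypothetical helper: regenerate the x/y columns for a given seed so a
    # pleasing layout can be reproduced exactly.
    random.seed(seed)
    n = len(words_and_counts)
    return words_and_counts.assign(
        x=random.sample(range(n), k=n),
        y=random.sample(range(n), k=n),
    )

data = layout_with_seed(words_and_counts, seed=42)  # try a few seeds and keep the best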
Having thus prepared your data you may specify the word cloud elements like so:
base = alt.Chart(data).encode(
    x=alt.X('x:O', axis=None),
    y=alt.Y('y:O', axis=None)
).configure_view(strokeWidth=0)  # remove border

word_cloud = base.mark_text(baseline='middle').encode(
    text='word:N',
    color=alt.Color('count:Q', scale=alt.Scale(scheme='goldred')),
    size=alt.Size('count:Q', legend=None)
)
Here's the result applied to the same dataset used in the Vega docs:

Solution to identify "Similar Product Images"?

I want to build a cloud-based solution in which I would provide a pool of images and then ask it to "find images similar to a particular image from this pool". The pool could be, for example, all t-shirt images; hence, "similar images" means t-shirts with a similar design/color/sleeves, etc.
A tagging solution won't work, as tags are at a very high level.
AWS Rekognition gives "facial similarity", but not "product similarity"; it does not work for images of dresses, for example.
I am open to using any cloud provider, but all of them provide "tags" for an image, which won't help me.
One solution could be to use an ML framework like MXNet/TensorFlow, create my own models, train them, and then use those. But is there any ready-made solution from any of the cloud providers?
IBM Bluemix has an API to find similar images: https://www.ibm.com/watson/developercloud/visual-recognition/api/v3/#find_similar
Using Azure Cognitive Services (Computer Vision, to be more precise) you can get categories, tags, a caption, and even more info for images. Processing all your images would provide tags for your image pool, and that enables you to get similar images based on (multiple) identical tags.
This feature returns information about visual content found in an image. Use tagging, descriptions, and domain-specific models to identify content and label it with confidence. Apply the adult/racy settings to enable automated restriction of adult content. Identify image types and color schemes in pictures.
An example of (part of) a result of the Computer Vision API:
Description {
  "Tags": [
    "train", "platform", "station", "building", "indoor", "subway", "track",
    "walking", "waiting", "pulling", "board", "people", "man", "luggage",
    "standing", "holding", "large", "woman", "yellow", "suitcase"
  ],
  "Captions": [
    {
      "Text": "people waiting at a train station",
      "Confidence": 0.8331026
    }
  ]
}
Tags [
  {
    "Name": "train",
    "Confidence": 0.9975446
  },
  {
    "Name": "platform",
    "Confidence": 0.995543063
  },
  {
    "Name": "station",
    "Confidence": 0.9798007
  },
  {
    "Name": "indoor",
    "Confidence": 0.927719653
  },
  {
    "Name": "subway",
    "Confidence": 0.838939846
  },
  {
    "Name": "pulling",
    "Confidence": 0.431715637
  }
]
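Once every image in the pool has a tag list like the one above, one simple (purely hypothetical) way to rank candidates is tag-set overlap, e.g. Jaccard similarity over the returned tag names. This is only a sketch of the idea, not part of the Computer Vision API:

# Hypothetical sketch: rank pool images by how many tags they share with a
# query image, using Jaccard similarity over the tag names returned above.
def jaccard(tags_a, tags_b):
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def most_similar(query_tags, pool):
    # pool: dict mapping image id -> list of tag names from the tagging API
    scored = [(image_id, jaccard(query_tags, tags)) for image_id, tags in pool.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

pool = {
    "img1": ["t-shirt", "blue", "stripe", "crew-neck"],
    "img2": ["shirt", "red", "checkered", "button-up"],
}
print(most_similar(["t-shirt", "stripe", "navy"], pool))  # img1 ranks first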
You could also use the Bing Image Search API (https://azure.microsoft.com/en-us/services/cognitive-services/bing-image-search-api/) which allows you to do an image search based on specified criteria within your solution...
You could use a combination of things here. Use an image tagging service (AWS Rekognition, or any of those listed above), then create some training data with similar images and upload that into something like AWS Machine Learning. This is a bit similar to what has been brought up earlier; however, I am trying to make clear that while tagging might not be the be-all and end-all of your solution, it will likely play a role as a first step toward a more complex process.
Please check the site http://cloudsight.ai/api and try the demo. A sample result would be:
{
  "token": "BBKA0lW9O-B2eamXUysdXA",
  "url": "http://assets.cloudsight.ai/uploads/image_request/image/314/314978/314978186/79379_86cb4e2611d6b0a3287a926a1ca1fe51_image1_zoom.jpg",
  "ttl": 54,
  "status": "completed",
  "name": "men's red and black checkered button-up shirt"
}
{
  "token": "bjX7nWGs0toajIDwyvXxlw",
  "url": "http://assets.cloudsight.ai/uploads/image_request/image/314/314987/314987168/11.jpg",
  "ttl": 54,
  "status": "completed",
  "name": "blue, gray and navy blue stripe crew-neck T-shirt"
}

How to visualize NodeJS .cpuprofile

I use v8-profiler to profile my Node.js app. It generates a .cpuprofile file.
I used to be able to visualize the content of the file with Google Chrome's built-in DevTools. However, Chrome recently changed the file format for profiling results, and Chrome is no longer able to read .cpuprofile files.
Note: My goal is to see the call tree and the bottom-up view. I do not care about the flame chart.
Thanks.
I ended up downloading an old Chromium version. http://commondatastorage.googleapis.com/chromium-browser-continuous/index.html?prefix=Win_x64/381909/
There is a VS Code extension for viewing .cpuprofile files:
Flame Chart Visualizer for JavaScript Profiles
https://marketplace.visualstudio.com/items?itemName=ms-vscode.vscode-js-profile-flame
Yes, it seems the format has changed. From Node.js v9.11.1 I'm getting a tree-like JSON structure:
{
  "typeId": "CPU",
  "uid": "1",
  "title": "Profile 1",
  "head": {
    "functionName": "(root)",
    "url": "",
    "lineNumber": 0,
    "callUID": 1319082045,
    "bailoutReason": "no reason",
    "id": 17,
    "hitCount": 0,
    "children": [
      {
        "functionName": "(anonymous function)",
        "url": "...",
        "lineNumber": 726,
        "callUID": 3193325993,
        "bailoutReason": "no reason",
        "id": 16,
        "hitCount": 0,
        "children": [
          {
            ...
From Chromium 66.0.3359.117 I'm getting a flat structure:
{
  "nodes": [
    {
      "id": 1,
      "callFrame": {
        "functionName": "(root)",
        "scriptId": "0",
        "url": "",
        "lineNumber": -1,
        "columnNumber": -1
      },
      "hitCount": 0,
      "children": [
        2,
        3
      ]
    },
    {
      ...
What worked for me is the chrome2calltree tool, which takes the old format used by Node.js and turns it into a .prof file that tools like KCacheGrind and QCacheGrind can open.
