maxminddb-rust, get value of country for certain language


I decided to start learning Rust. I haven't finished the book yet, but I am building and running other projects so I can learn from their source code. I am now interested in the maxminddb-rust crate; specifically, I want to retrieve country, city and ASN values from an .mmdb file.
I tried converting the maxminddb::geoip2::Country struct to a string and parsing it with the json crate, but that resulted in errors I couldn't fix myself.
The code used:
use maxminddb::geoip2;
use std::net::IpAddr;
use std::str::FromStr;
fn main() {
    let mmdb_file = maxminddb::Reader::open("C:\\path\\to\\GeoLite2-City.mmdb").unwrap();
    let ip_addr: IpAddr = FromStr::from_str("8.8.8.8").unwrap();
    let geoip2_country: geoip2::Country = mmdb_file.lookup(ip_addr).unwrap();
    println!("{:?}", geoip2_country);
}
The output is:
Country
{
continent: Some(Continent
{
code: Some("NA"),
geoname_id: Some(6255149),
names: Some(
{
"de": "Nordamerika",
"en": "North America",
"es": "Norteamérica",
"fr": "Amérique du Nord",
"ja": "北アメリカ",
"pt-BR": "América do Norte",
"ru": "Северная Америка",
"zh-CN": "北美洲"
})
}),
country: Some(Country
{
geoname_id: Some(6252001),
iso_code: Some("US"),
names: Some(
{
"de": "USA",
"en": "United States",
"es": "Estados Unidos",
"fr": "États-Unis",
"ja": "アメリカ合衆国",
"pt-BR": "Estados Unidos",
"ru": "США",
"zh-CN": "美国"
})
}),
registered_country: Some(Country
{
geoname_id: Some(6252001),
iso_code: Some("US"),
names: Some(
{
"de": "USA",
"en": "United States",
"es": "Estados Unidos",
"fr": "États-Unis",
"ja": "アメリカ合衆国",
"pt-BR": "Estados Unidos",
"ru": "США",
"zh-CN": "美国"
})
}),
represented_country: None,
traits: None
}
which is the maxminddb::geoip2::Country struct (http://oschwald.github.io/maxminddb-rust/maxminddb/geoip2/struct.Country.html)
Changing the last line of code to
println!("{:?}", geoip2_country.country);
outputs only the country field:
country: Some(Country
{
geoname_id: Some(6252001),
iso_code: Some("US"),
names: Some(
{
"de": "USA",
"en": "United States",
"es": "Estados Unidos",
"fr": "États-Unis",
"ja": "アメリカ合衆国",
"pt-BR": "Estados Unidos",
"ru": "США",
"zh-CN": "美国"
})
}),
But looking at the structure of maxminddb::geoip2::model::Country (http://oschwald.github.io/maxminddb-rust/maxminddb/geoip2/model/struct.Country.html), I am confused about how to retrieve a value from its names field, which holds (language_code, country_name) pairs. For example, how do I get the name of the country for the "en" key?

You could have a look at how the tests are written; they should give you some idea of how to extract the information you are looking for:
https://github.com/oschwald/maxminddb-rust/blob/master/src/maxminddb/reader_test.rs
(Actually, after looking at the tests, I didn't find any example extracting what you are looking for.)
This should return what you expect:
let country_name = geoip2_country.country
    .and_then(|cy| cy.names)
    .and_then(|n| n.get("en").map(String::from));
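To see why the chain works without needing the crate or a database file, here is a minimal sketch. The `Country` and `CountryModel` structs below are simplified stand-ins, not the crate's real definitions; they only assume the shape visible in the debug output above (nested `Option`s around a map from language code to name). The `country_name` helper is hypothetical as well. Unlike the snippet above, it borrows the record with `as_ref` instead of consuming it, so the record can still be used afterwards.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the crate's types (assumption: `names` is an
// optional map from language code to localized name, as the debug output suggests).
struct CountryModel {
    names: Option<BTreeMap<String, String>>,
}

struct Country {
    country: Option<CountryModel>,
}

// Hypothetical helper: borrows the record and returns the name for `lang`,
// or None if any level of the nesting is missing.
fn country_name<'a>(record: &'a Country, lang: &str) -> Option<&'a str> {
    record
        .country
        .as_ref() // Option<&CountryModel>
        .and_then(|c| c.names.as_ref()) // Option<&BTreeMap<String, String>>
        .and_then(|names| names.get(lang)) // Option<&String>
        .map(String::as_str)
}

fn main() {
    let mut names = BTreeMap::new();
    names.insert("en".to_string(), "United States".to_string());
    names.insert("de".to_string(), "USA".to_string());
    let record = Country {
        country: Some(CountryModel { names: Some(names) }),
    };

    match country_name(&record, "en") {
        Some(name) => println!("{}", name),
        None => println!("no name for that language"),
    }
}
```

Each `and_then` step short-circuits to `None` as soon as one level is absent, which is why no explicit `unwrap` is needed.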

Related

Idiomatic Rust for Vec<Vec<String>> filter

I'm very new to programming and for some reason I chose Rust to start with, but I digress...
My current code takes a reference to a Vec of Vecs of Strings, compares one 'column' against an array of values, and returns each matching row along with a truncated version of that row.
The code below compiles, but I keep reusing this same logic across all my functions, and it would be really useful to have some kind of closure instead.
fn niacs(vec: &Vec<Vec<String>>, naics: Vec<&str>) -> (Vec<Vec<String>>, Vec<Vec<String>>) {
    let mut return_vec_truncated = vec![];
    let mut return_vec_full = vec![];
    let mut contains_naice = false;
    for i in vec.iter() {
        for naic in naics.iter() {
            if i[31] == *naic {
                contains_naice = true;
            }
            if contains_naice {
                let x = info_wanted(&i);
                return_vec_truncated.push(x);
                return_vec_full.push(i.clone());
            }
            contains_naice = false;
        }
    }
    (return_vec_full, return_vec_truncated)
}
What I would like to be able to do is write something like:
let x = vec.iter().map(|x| x.iter()).filter(|x| x != naics);
The problem is that I don't want to map across all of the elements per se; I just want to match against one column within the inner Vec, which is column 31.
Sample Vec of Vec:
["471", "001020887", "", "1SNH0", "", "A", "Z2", "20020312", "20210711", "20200731", "20200731", "BENCHMARK INTERNATIONAL, INC", "", "", "", "5025 PIRATES COVE RD", "", "JACKSONVILLE", "FL", "32210", "8309", "USA", "04", "19990208", "1231", "http://www.bmiint.com", "2L", "VA", "USA", "0005", "27~2X~A5~QF~XS", "541611", "", "0002", "541611Y~541690Y", "0000", "", "N", "", "5025 PIRATES COVE RD", "", "JACKSONVILLE", "32210", "8309", "USA", "FL", "ANNA", "", "MCKENZIE", "", "5025 PIRATES COVE RD", "", "JACKSONVILLE", "32210", "8309", "USA", "FL", "4437170460", "", "", "", "amckenzie#bmiint.com"], ["472", "001021310", "", "94867", "", "A", "Z2", "19980424", "20210323", "20200323", "20200323", "CHURCHILL CORPORATION", "", "", "", "344 FRANKLIN ST", "", "MELROSE", "MA", "02176", "1825", "USA", "05", "19480101", "0731", "", "2L", "MA", "USA", "0002", "2X~MF", "332322", "", "0001", "332322Y", "0000", "", "Y", "", "P.O.BOX 761038", "", "MELROSE", "02176", "1825", "USA", "MA", "MARSHALL", "W", "SCHERMERHORN", "", "P.O. BOX 761038", "344 FRANKLIN STREET", "MELROSE", "02176", "", "USA", "MA", "7816654700", "", "", "7816625291", "marshall#atrbox.com"]]

First, to confirm that I'm not overlooking something: your code is equivalent to
fn niacs(vec: &Vec<Vec<String>>, naics: Vec<&str>) -> (Vec<Vec<String>>, Vec<Vec<String>>) {
    let mut return_vec_truncated = vec![];
    let mut return_vec_full = vec![];
    for i in vec.iter() {
        for naic in naics.iter() {
            if i[31] == *naic {
                return_vec_truncated.push(info_wanted(&i));
                return_vec_full.push(i.clone());
            }
        }
    }
    (return_vec_full, return_vec_truncated)
}
and the contains_naice flag is unnecessary? Apart from contains_naice, I think your code is easy to read and plenty idiomatic.
If you absolutely want to write it with iterators, you can use flat_map and unzip:
fn niacs(vec: &Vec<Vec<String>>, naics: Vec<&str>) -> (Vec<Vec<String>>, Vec<Vec<String>>) {
    vec.iter()
        .flat_map(|i| naics.iter().filter(|naic| i[31] == **naic).map(move |_| i))
        .map(|i| (i.clone(), info_wanted(&i)))
        .unzip()
}
Though I do wonder: is that actually what you wanted? Do you really want one copy of i per matching naic? Or do you want one copy of i if any of the naics match?
fn niacs(vec: &Vec<Vec<String>>, naics: Vec<&str>) -> (Vec<Vec<String>>, Vec<Vec<String>>) {
    vec.iter()
        .filter(|i| naics.iter().any(|naic| i[31] == *naic))
        .map(|i| (i.clone(), info_wanted(&i)))
        .unzip()
}
(The equivalent iterative code would have a break inside the if.)
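Since the question mentions reusing the same logic across several functions, one possible way to factor it out is to pass the column index and the wanted values as parameters. This is only a sketch under assumptions: `filter_by_column` is a hypothetical name, and the body of `info_wanted` here is a made-up placeholder (the question does not show the real one). The sample rows are shortened, so the example matches on column 2 rather than column 31.

```rust
/// Placeholder for the question's `info_wanted` (assumption: the real
/// function is not shown, so this one just keeps the first two fields).
fn info_wanted(row: &[String]) -> Vec<String> {
    row.iter().take(2).cloned().collect()
}

/// Returns (full rows, truncated rows) for every row whose field at `col`
/// equals any of the `wanted` values. Each matching row appears once.
fn filter_by_column(
    rows: &[Vec<String>],
    col: usize,
    wanted: &[&str],
) -> (Vec<Vec<String>>, Vec<Vec<String>>) {
    rows.iter()
        // `get` skips rows that are too short instead of panicking like `row[col]`.
        .filter(|row| row.get(col).map_or(false, |v| wanted.contains(&v.as_str())))
        .map(|row| (row.clone(), info_wanted(row)))
        .unzip()
}

fn main() {
    let rows: Vec<Vec<String>> = vec![
        vec!["471".into(), "BENCHMARK".into(), "541611".into()],
        vec!["472".into(), "CHURCHILL".into(), "332322".into()],
    ];
    let (full, truncated) = filter_by_column(&rows, 2, &["541611"]);
    println!("{:?}", full);
    println!("{:?}", truncated);
}
```

Taking `&[Vec<String>]` and `&[&str]` instead of `&Vec<...>` and `Vec<&str>` lets callers pass vectors, arrays, or slices without giving up ownership.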

Update data from one Map to another

How can I do the following?
Suppose I have the following document:
{
    "_id": "12345",
    "email": "julio@gmail.com",
    "password": "julio123",
    "created": "2021-08-17",
    "status": "0"
}
and from that document I want to create a new document like the following one:
{
    "email": "julio@gmail.com",
    "password": "julio123",
    "status": "1"
}
I don't know if what I did is correct, but I know that more code is missing.
#[cfg(test)]
mod tests {
    use serde_json::{Map, json};
    #[test]
    fn prueba() {
        let mut user = Map::new();
        user.insert("id".to_string(), json!("12345"));
        user.insert("email".to_string(), json!("juan@gmail.com"));
        user.insert("password".to_string(), json!("juan123"));
        user.insert("created".to_string(), json!("2021-08-17"));
        user.insert("status".to_string(), json!("0"));
        println!("user: {:#?}", user);
        let user2 = user.into_iter().fold(Map::new(), |mut user_map, (key, val)| {
            user_map.insert(key.clone(), val.clone());
            user_map
        });
        println!("user2: {:#?}", user2);
    }
}
I'm just learning functional programming and need help getting the best possible result.

What you seem to want to achieve is to filter and update some fields, creating a new object as a result.
Here is how you might do that:
let user2 = user
    .into_iter()
    .filter_map(|(k, v)| match k.as_str() {
        "email" => Some((k, v)),
        "password" => Some((k, v)),
        "status" => Some((k, json!("1"))),
        _ => None,
    })
    .collect::<Map<_, _>>();
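The same filter-and-update pattern can be tried out without any external crate. The sketch below swaps `serde_json::Map` for the standard library's `BTreeMap<String, String>` purely so that it runs on its own; that substitution is an assumption for illustration, and `to_new_user` is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

// Keeps only `email`, `password`, and `status`, and overwrites `status` with "1".
// Assumption: plain String values stand in for serde_json::Value.
fn to_new_user(user: BTreeMap<String, String>) -> BTreeMap<String, String> {
    user.into_iter()
        .filter_map(|(k, v)| match k.as_str() {
            "email" | "password" => Some((k, v)),
            "status" => Some((k, "1".to_string())),
            _ => None, // every other field is dropped
        })
        .collect()
}

fn main() {
    let mut user = BTreeMap::new();
    user.insert("_id".to_string(), "12345".to_string());
    user.insert("email".to_string(), "julio@gmail.com".to_string());
    user.insert("password".to_string(), "julio123".to_string());
    user.insert("created".to_string(), "2021-08-17".to_string());
    user.insert("status".to_string(), "0".to_string());

    let user2 = to_new_user(user);
    println!("{:#?}", user2);
}
```

`filter_map` does the filtering and the update in a single pass: returning `None` drops a field, and returning `Some` with a different value rewrites it.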

How to extract selected key and value from nested dictionary object in a list?

I have a list, example_list, that contains two dict objects. It looks like this:
[
{
"Meta": {
"ID": "1234567",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "apple",
"price": "0.111"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
},
{
"Meta": {
"ID": "78945612",
"XXX": "XXX"
},
"bbb": {
"ccc": {
"ddd": {
"eee": {
"fff": {
"xxxxxx": "xxxxx"
},
"www": [
{
"categories": {
"ppp": [
{
"content": {
"name": "banana",
"price": "12.599"
},
"xxx": "xxx"
}
]
},
"date": "A2020-01-01"
}
]
}
}
}
}
}
]
Now I want to filter the items and keep only "ID": "xxx" and the corresponding value for "price": "0.111". The expected result can be something similar to:
[{"ID": "1234567", "price": "0.111"}, {"ID": "78945612", "price": "12.599"}]
or something like {"1234567": "0.111", "78945612": "12.599"}
Here's what I've tried:
map_list = []
map_dict = {}
for item in example_list:
    # get 'ID' for each item in 'Meta'
    map_dict['ID'] = item['Meta']['ID']
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            map_dict['price'] = dataitem["content"]["price"]
            map_list.append(map_dict)
print(map_list)
The result doesn't look right; it feels like the items aren't iterating properly. It gives me:
[{"ID": "78945612", "price": "12.599"}, {"ID": "78945612", "price": "12.599"}]
The second ID is duplicated, but where is the first ID?
Can someone take a look, please? Thanks.
Update:
From comments on another question, I understand that the output keeps being overwritten because the key names in the dict are always the same. But I'm not sure how to fix this, because the key and the value need to be extracted at different levels of the for loops. Any help would be appreciated, thanks.

As @Scott Hunter has mentioned, you need to create a new map_dict every time you add an entry. Here is a quick fix to your solution (I am sadly not able to test it right now, but it seems right to me).
map_list = []
for item in example_list:
    # get 'price'
    data_list = item['bbb']['ccc']['ddd']['eee']['www']
    for data in data_list:
        for dataitem in data['categories']['ppp']:
            # build a fresh dict for every entry so earlier results are not overwritten
            map_dict = {}
            map_dict['ID'] = item['Meta']['ID']
            map_dict['price'] = dataitem["content"]["price"]
            map_list.append(map_dict)
print(map_list)
But what you are doing here is basically "forcing" your way through. I recommend taking a break and working through a tutorial, which will help you understand how this really works under the hood. This is how I would have written it:
list_dicts = []
for item in example_list:
    for www in item['bbb']['ccc']['ddd']['eee']['www']:
        for www_item in www['categories']['ppp']:
            list_dicts.append({
                'ID': item['Meta']['ID'],
                'price': www_item["content"]["price"]
            })
Good luck with this problem, and I hope it helps :)

You need to create a new dictionary for map_dict for each ID.

PySpark Dataframe to Json - grouping data

We are trying to create JSON from a dataframe. The dataframe is shown below:
+----------+--------------------+----------+--------------------+-----------------+--------------------+---------------+--------------------+---------------+--------------------+--------------------+
| CustId| TIN|EntityType| EntityAttributes|AddressPreference| AddressDetails|EmailPreference| EmailDetails|PhonePreference| PhoneDetails| MemberDetails|
+----------+--------------------+----------+--------------------+-----------------+--------------------+---------------+--------------------+---------------+--------------------+--------------------+
|1234567890|XXXXXXXXXXXXXXXXXX...| Person|[{null, PRINCESS,...| Alternate|[{Home, 460 M XXX...| Primary|[{Home, HEREBY...| Alternate|[{Home, {88888888...|[{7777777, 999999...|
|1234567890|XXXXXXXXXXXXXXXXXX...| Person|[{null, PRINCESS,...| Alternate|[{Home, 460 M XXX...| Primary|[{Home, HEREBY...| Primary|[{Home, {88888888...|[{7777777, 999999...|
|1234567890|XXXXXXXXXXXXXXXXXX...| Person|[{null, PRINCESS,...| Primary|[{Home, PO BOX 695020...| Primary|[{Home, HEREBY...| Alternate|[{Home, {88888888...|[{7777777, 999999...|
|1234567890|XXXXXXXXXXXXXXXXXX...| Person|[{null, PRINCESS,...| Primary|[{Home, PO BOX 695020...| Primary|[{Home, HEREBY...| Primary|[{Home, {88888888...|[{7777777, 999999...|
+----------+--------------------+----------+--------------------+-----------------+--------------------+---------------+--------------------+---------------+--------------------+--------------------+
The initial columns CustId, TIN, EntityType and EntityAttributes will be the same for a particular customer, say 1234567890 in our example, but that customer might have multiple addresses, phones and emails. Could you please help us group them under a single JSON document?
Expected Structure :
{
"CustId": 1234567890,
"TIN": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"EntityType": "Person",
"EntityAttributes": [
{
"FirstName": "PRINCESS",
"LastName": "XXXXXX",
"BirthDate": "xxxx-xx-xx",
"DeceasedFlag": "False"
}
],
"Address": [
{
"AddressPreference": "Alternate",
"AddressDetails": {
"AddressType": "Home",
"Address1": "460",
"City": "XXXX",
"State": "XXX",
"Zip": "XXXX"
}
},
{
"AddressPreference": "Primary",
"AddressDetails": {
"AddressType": "Home",
"Address1": "PO BOX 695020",
"City": "XXX",
"State": "XXXX",
"Zip": "695020"
}
}
],
"Phone": [
{
"PhonePreference": "Primary",
"PhoneDetails": {
"PhoneType": "Home",
"PhoneNumber": "xxxxx",
"FormatPhoneNumber": "xxxxxx"
}
},
{
"PhonePreference": "Alternate",
"PhoneDetails": {
"PhoneType": "Home",
"PhoneNumber": "xxxx",
"FormatPhoneNumber": "xxxxx"
}
}
],
"Email": [
{
"EmailPreference": "Primary",
"EmailDetails": {
"EmailType": "Home",
"EmailAddress": "xxxxxxx@GMAIL.COM"
}
}
]
}
UPDATE
We tried the groupBy method recommended below. It gave us one record per customer, but the email is repeated 4 times in the list, when ideally there should be only 1 email. Also, there is 1 Alternate address and 1 Primary address, but the output shows 2 Alternate entries and 2 Primary entries. Could you please help with an ideal solution?

Something like this should work. Here id plays the role of CustId in your example and has repeating values.
>>> df.show()
+----+------------+----------+
| id| address| email|
+----+------------+----------+
|1001| address-a| email-a|
|1001| address-b| email-b|
|1002|address-1002|email-1002|
|1003|address-1003|email-1002|
|1002| address-c| email-2|
+----+------------+----------+
Aggregate on those repeating columns and then convert to JSON:
>>> from pyspark.sql.functions import collect_list
>>> results = df.groupBy("id").agg(collect_list("address").alias("address"),collect_list("email").alias("email")).toJSON().collect()
>>> for i in results: print(i)
...
{"id":"1003","address":["address-1003"],"email":["email-1002"]}
{"id":"1002","address":["address-1002","address-c"],"email":["email-1002","email-2"]}
{"id":"1001","address":["address-a","address-b"],"email":["email-a","email-b"]}

How do I associate a value in a dictionary with a value in a nested dictionary?

I'm trying to figure out how to return the value for participantId when inputting the value for summonerName. I thought of something like participantIdentities.index(...) but I'm working with a list of dictionaries that contain nested dictionaries, so I'm really not sure on this.
Example:
Input: dd god
Output: 1
"participantIdentities": [
{
"participantId": 1,
"player": {
"accountId": "g7cSjy8G4hMab3ayDaY8cqOSSjztYvktybRT_XgkBsJUSD0",
"currentAccountId": "g7cSjy8G4hMab3ayDaY8cqOSSjztYvktybRT_XgkBsJUSD0",
"currentPlatformId": "NA1",
"matchHistoryUri": "/v1/stats/player_history/NA1/233825986",
"platformId": "NA1",
"profileIcon": 1453,
"summonerId": "ICw1u2Kv-lHR_bjaNPa6BHbpwGT5rqJJJIVfiqcpbBdy9LM",
"summonerName": "dd god"
}
},
{
"participantId": 2,
"player": {
"accountId": "ZJ3NohMpa_FZHHSxyFBOxjyuU6JpL-LEbctTPV2pDeuNbw",
"currentAccountId": "oS_oSZLMTC3ZYVYiehAR6ZA4Gly-qq_WT_c5uXvQRRryzlw",
"currentPlatformId": "NA1",
"matchHistoryUri": "/v1/stats/player_history/EUW1/22181515",
"platformId": "EUW1",
"profileIcon": 3271,
"summonerId": "PlJSBls3iy0emnKRlwFZqvye8Plwnbp5ngG_NJ6JQmDI1nE",
"summonerName": "TSM Bjergsen"
}
}
]

You can do something like this:
dict_ = {"participantIdentities": [
{
"participantId": 1,
"player": {
"accountId": "g7cSjy8G4hMab3ayDaY8cqOSSjztYvktybRT_XgkBsJUSD0",
"currentAccountId": "g7cSjy8G4hMab3ayDaY8cqOSSjztYvktybRT_XgkBsJUSD0",
"currentPlatformId": "NA1",
"matchHistoryUri": "/v1/stats/player_history/NA1/233825986",
"platformId": "NA1",
"profileIcon": 1453,
"summonerId": "ICw1u2Kv-lHR_bjaNPa6BHbpwGT5rqJJJIVfiqcpbBdy9LM",
"summonerName": "dd god"
}
},
{
"participantId": 2,
"player": {
"accountId": "ZJ3NohMpa_FZHHSxyFBOxjyuU6JpL-LEbctTPV2pDeuNbw",
"currentAccountId": "oS_oSZLMTC3ZYVYiehAR6ZA4Gly-qq_WT_c5uXvQRRryzlw",
"currentPlatformId": "NA1",
"matchHistoryUri": "/v1/stats/player_history/EUW1/22181515",
"platformId": "EUW1",
"profileIcon": 3271,
"summonerId": "PlJSBls3iy0emnKRlwFZqvye8Plwnbp5ngG_NJ6JQmDI1nE",
"summonerName": "TSM Bjergsen"
}
}
]
}
input_val = "dd god"
identities = dict_['participantIdentities']
for p in identities:
    if p['player']['summonerName'] == input_val:
        print(p['participantId'])
With the for loop you can iterate through your list of dictionaries, and the nested dictionaries are accessible like a multidimensional array:
dict['a']['b'][...]
