Why is this test not passing with fast-xml-parser?
import parser from 'fast-xml-parser'
test("fastxmlparser", () => {
let parsed = parser.parse('<detail><name>john</name><value>116347579610481033</value></detail>')
expect(parsed.detail.value).toBe('116347579610481033')
})
Expected value to be:
"116347579610481033"
Received:
116347579610481040
Difference:
Comparing two different types of values. Expected string but received number.
95 | test("fastxmlparser", () => {
96 | let parsed = parser.parse('<detail><name>john</name><value>116347579610481033</value></detail>')
> 97 | expect(parsed.detail.value).toBe('116347579610481033')
98 | })
99 |
100 |
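The received value, 116347579610481040, suggests that the parser converted the tag text into a JavaScript number, and a 64-bit double cannot represent this integer exactly. The following is a minimal sketch of that rounding in plain JavaScript, independent of fast-xml-parser; the variable names are illustrative only, and whether the parser can be configured to leave values as strings is something to check in its documentation.

```javascript
// The id from the question, kept as text exactly as it appears in the XML.
const text = "116347579610481033";

// Converting to a Number rounds to the nearest representable double.
const asNumber = Number(text);

// Integers above Number.MAX_SAFE_INTEGER (2^53 - 1) are not guaranteed exact.
const isSafe = Number.isSafeInteger(asNumber);

// Round-tripping through Number changes the trailing digits.
const roundTripped = String(asNumber);

// BigInt (or simply keeping the string) preserves every digit.
const asBigInt = BigInt(text);

console.log(isSafe);                       // false
console.log(roundTripped);                 // 116347579610481040
console.log(asBigInt.toString() === text); // true
```

If the value is an identifier rather than a quantity, keeping it as a string throughout is usually the simplest way to avoid the problem.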
I'm new to Sphinx. I have a simple table, tbl_urls, with two columns (domain_id, url).
I created my index as below to get the domain ID and the number of URLs for any given keyword:
source src2
{
type = mysql
sql_host = 0.0.0.0
sql_user = spnx
sql_pass = 123
sql_db = db_spnx
sql_port = 3306 # optional, default is 3306
sql_query = select id,domain_id,url from tbl_domain_urls
sql_attr_uint = domain_id
sql_field_string = url
}
index url_tbl
{
source = src2
path =/var/lib/sphinx/data/url_tbl
}
indexer
{
mem_limit = 2047M
}
searchd
{
listen = 0.0.0.0:9312
listen = 0.0.0.0:9306:mysql41
listen = /home/charlie/sphinx-3.4.1/bin/searchd.sock:sphinx
log = /var/log/sphinx/sphinx.log
query_log = /var/log/sphinx/query.log
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinx/sphinx.pid
max_filter_values = 20000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
workers = threads # for RT indexes to work
binlog_path = /var/lib/sphinx/data
max_batch_queries = 128
}
The problem is that the query takes over a minute to return results:
SELECT domain_id,count(*) as url_counter
FROM url_tbl WHERE MATCH('games')
group by domain_id limit 1000000 OPTION max_matches=1000000;show meta;
+-----------+-------+
| domain_id | url |
+-----------+-------+
| 9900 | 444 |
| 41309 | 48 |
| 62308 | 491 |
| 85798 | 401 |
| 595 | 4851 |
13545 rows in set (3 min 22.56 sec)
+---------------+--------+
| Variable_name | Value |
+---------------+--------+
| total | 13545 |
| total_found | 13545 |
| time | 1.406 |
| keyword[0] | games |
| docs[0] | 456667 |
| hits[0] | 514718 |
+---------------+--------+
Table tbl_domain_urls has 100,821,614 rows.
Dedicated server: HP ProLiant, 2x L5420, 16 GB RAM, 2x 1 TB HDD.
I need your help to optimize my query or config settings. I need the results in the lowest time possible, and I would really appreciate any new idea to test.
Note:
I tried a distributed index to use multiple cores for processing, without any noticeable improvement.
I have two lists of different types, and I want to compare them and update the 'Check' value in list 2 if the 'Brand' from list 2 is found in list 1.
-------------------- --------------------
| Name | Brand | | Brand | Check |
-------------------- --------------------
| vga x | Asus | | MSI | X |
| vga b | Asus | | ASUS | - |
| mobo x | MSI | | KINGSTON | - |
| memory | Kingston| | SAMSUNG | - |
-------------------- --------------------
So usually I just do:
for(x in list1){
for(y in list2){
if(y.brand == x.brand){
y.check = true
}
}
}
Is there a simpler solution for that?
Since you're mutating the objects, it doesn't really get any cleaner than what you have. It can be done using any() like this, but in my opinion it is not any clearer to read:
list2.forEach { bar ->
bar.check = bar.check || list1.any { it.brand == bar.brand }
}
The above is slightly more efficient than what you have since it inverts the iteration of the two lists so you don't have to check every element of list1 unless it's necessary. The same could be done with yours like this:
for(x in list2){
for(y in list1){
if(y.brand == x.brand){
x.check = true
break
}
}
}
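When both lists are large, a different technique from the nested loops above is to collect the brands of list 1 into a set once and then make a single pass over list 2. The following is a minimal sketch in plain JavaScript, not the Kotlin from the question; markChecked and the field names brand and check are hypothetical, and lower-casing the brand is an assumption based on the sample data, where list 1 has "Asus" and list 2 has "ASUS".

```javascript
// Marks every entry of list2 whose brand also occurs in list1.
// Assumption: brands are compared case-insensitively.
function markChecked(list1, list2) {
  // One pass over list1 to build the lookup set.
  const brands = new Set(list1.map((item) => item.brand.toLowerCase()));
  // One pass over list2; an entry that is already checked stays checked.
  for (const entry of list2) {
    if (brands.has(entry.brand.toLowerCase())) {
      entry.check = true;
    }
  }
  return list2;
}

const list1 = [
  { name: "vga x", brand: "Asus" },
  { name: "vga b", brand: "Asus" },
  { name: "mobo x", brand: "MSI" },
  { name: "memory", brand: "Kingston" },
];
const list2 = [
  { brand: "MSI", check: true },
  { brand: "ASUS", check: false },
  { brand: "KINGSTON", check: false },
  { brand: "SAMSUNG", check: false },
];

console.log(markChecked(list1, list2).map((entry) => entry.check));
```

This trades a little memory for the set against not rescanning list 1 for every entry of list 2.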
data class Item(val name: String, val brand: String)
fun main() {
val list1 = listOf(
Item("vga_x", "Asus"),
Item("vga_b", "Asus"),
Item("mobo_x", "MSI"),
Item("memory", "Kingston")
)
val list2 = listOf(
Item("", "MSI"),
Item("", "ASUS"),
Item("", "KINGSTON"),
Item("", "SAMSUNG")
)
// Get intersections
val intersections = list1.map{it.brand}.intersect(list2.map{it.brand})
println(intersections)
// Returns => [MSI]
// Has any intersections
val intersected = list1.map{it.brand}.any { it in list2.map{it.brand} }
println(intersected)
// Returns ==> true
}
UPDATE: I just saw that this isn't a solution for your problem, but I'll leave it here.
I am having a bit of trouble figuring out how to get around missing full row errors when having a list of json objects which have optional fields, like this example:
let
Source = Json.Document("[
{ ""name"": ""Peter"", ""age"": 42, ""email"": ""something""},
{ ""name"": ""Peter"", ""age"": 42 }]"),
Tabled = Table.FromRecords(Source)
in
Tabled
That gives me a big fat error on the second row:
# | name | age | email |
--------------------------------
1 | Peter | 42 | something |
2 | Error | Error | Error |
Expression.Error: The field 'email' of the record wasn't found.
Details:
name=Peter
age=42
But I really just want it to "ignore" that, so I get something like:
# | name | age | email |
--------------------------------
1 | Peter | 42 | something |
2 | Peter | 42 | |
OK, so I managed to find a solution that is good enough for my case for now, although I think a better one could certainly be made, as it is a bit crude...
let
Source = Json.Document("[
{ ""name"": ""Peter"", ""age"": 42, ""email"": ""something""},
{ ""name"": ""Peter"", ""age"": 42 }]"),
Transformed = List.Transform(Source, each Record.TransformFields(_, {
{ "email", Text.Trim },
{ "name", Text.Trim },
{ "age", Int64.From }
}, MissingField.UseNull)),
Tabled = Table.FromRecords(Transformed)
in
Tabled
Which yields
# | name | age | email |
--------------------------------
1 | Peter | 42 | something |
2 | Peter | 42 | null |
(null goes away when applied to a sheet)
Ideally there would be a solution that requires far less code, but for now this will do.
If anyone has a better solution, feel free to share >.<
Table.FromRecords() stops parsing when fields are missing; use Table.FromList() instead.
Try the code below. You can use the column's expand menu to generate the table2 code.
let
Source = Json.Document("[
{ ""name"": ""Peter"", ""age"": 42, ""email"": ""something""},
{ ""name"": ""Peter"", ""age"": 42 }]"),
table1 = Table.FromList(Source,Splitter.SplitByNothing(),null,null,ExtraValues.Error),
table2 = Table.ExpandRecordColumn(table1, "Column1", {"name", "age", "email"}, {"Column1.name", "Column1.age", "Column1.email"})
in
table2
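The idea shared by both answers is to decide on the full set of columns first and then fill in a null wherever a record lacks one of them. Here is a minimal sketch of that idea in plain JavaScript rather than Power Query M; fillMissingFields is a hypothetical helper written for this illustration, not part of any library.

```javascript
// Returns new records that all have the same keys, using null for gaps.
function fillMissingFields(records) {
  // Collect every field name, in order of first appearance.
  const columns = [];
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (!columns.includes(key)) columns.push(key);
    }
  }
  // Rebuild each record with the full column list.
  return records.map((record) => {
    const row = {};
    for (const column of columns) {
      row[column] = column in record ? record[column] : null;
    }
    return row;
  });
}

const source = JSON.parse(
  '[{"name":"Peter","age":42,"email":"something"},{"name":"Peter","age":42}]'
);
const rows = fillMissingFields(source);
console.log(rows);
```

The second row keeps its name and age and receives email: null instead of causing an error.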
I am developing a Spark Streaming application where I want to have one global numeric ID per item in my data stream. Having an interval/RDD-local ID is trivial:
dstream.transform(_.zipWithIndex).map(_.swap)
This will result in a DStream like:
// key: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 || 0 | 1 | 2 | 3 | 4 || 0
// val: a | b | c | d | e | f | g | h | i || j | k | l | m | n || o
(where the double bar || indicates the beginning of a new RDD).
What I finally want to have is:
// key: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 || 9 | 10 | 11 | 12 | 13 || 14
// val: a | b | c | d | e | f | g | h | i || j | k | l | m | n || o
How can I do that in a safe and performant way?
This seems like a trivial task, but I find it very hard to preserve state (state = "number of items seen so far") between RDDs. Here are two approaches I tried, both updating the number of items seen so far (plus the number in the current interval) using updateStateByKey with a bogus key:
val intervalItemCounts = inputStream.count().map((1, _))
// intervalItemCounts looks like:
// K: 1 || 1 || 1
// V: 9 || 5 || 1
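// ItemCount = (items seen before this interval, items seen including it)
type ItemCount = (Long, Long)
val updateCountState: (Seq[Long], Option[ItemCount]) => Option[ItemCount] =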
(itemCounts, maybePreviousState) => {
val previousState = maybePreviousState.getOrElse((0L, 0L))
val previousItemCount = previousState._2
Some((previousItemCount, previousItemCount + itemCounts.head))
}
val totalNumSeenItems: DStream[ItemCount] = intervalItemCounts.
updateStateByKey(updateCountState).map(_._2)
// totalNumSeenItems looks like:
// V: (0,9) || (9,14) || (14,15)
// The first approach uses a cartesian product with the
// 1-element state DStream. (Is this performant?)
val increaseRDDIndex1: (RDD[(Long, Char)], RDD[ItemCount]) =>
RDD[(Long, Char)] =
(streamData, totalCount) => {
val product = streamData.cartesian(totalCount)
product.map(dataAndOffset => {
val ((localIndex: Long, data: Char),
(offset: Long, _)) = dataAndOffset
(localIndex + offset, data)
})
}
val globallyIndexedItems1: DStream[(Long, Char)] = inputStream.
transformWith(totalNumSeenItems, increaseRDDIndex1)
// The second approach uses a take() output operation on the
// 1-element state DStream beforehand. (Is this valid?? Will
// the closure be serialized and shipped in every interval?)
val increaseRDDIndex2: (RDD[(Long, Char)], RDD[ItemCount]) =>
RDD[(Long, Char)] = (streamData, totalCount) => {
val offset = totalCount.take(1).head._1
streamData.map(keyValue => (keyValue._1 + offset, keyValue._2))
}
val globallyIndexedItems2: DStream[(Long, Char)] = inputStream.
transformWith(totalNumSeenItems, increaseRDDIndex2)
Both approaches give the correct result (with local[*] master), but I am wondering about performance (shuffle etc.), whether it works in a truly distributed environment and whether it shouldn't be a lot easier than that...
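To make the intended numbering concrete, here is a minimal single-process sketch in plain JavaScript, not Spark: each batch stands in for one RDD, and the only state carried between batches is the number of items seen so far. makeGlobalIndexer is a hypothetical name, and the sketch says nothing about how the state should be distributed or checkpointed in a real cluster.

```javascript
// Returns a function that assigns globally increasing keys to each batch.
function makeGlobalIndexer() {
  let seenSoFar = 0; // state: number of items seen in earlier batches
  return function indexBatch(batch) {
    const offset = seenSoFar; // offset for this batch
    seenSoFar += batch.length; // update the state for the next batch
    // Global key = offset + local index within the batch.
    return batch.map((value, localIndex) => [offset + localIndex, value]);
  };
}

const indexBatch = makeGlobalIndexer();
const first = indexBatch(["a", "b", "c", "d", "e", "f", "g", "h", "i"]);
const second = indexBatch(["j", "k", "l", "m", "n"]);
const third = indexBatch(["o"]);
console.log(first, second, third);
```

The offsets used for the three batches are 0, 9 and 14, matching the first components of the (0,9), (9,14), (14,15) state values shown above.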
I am getting a dataset using LINQ to SQL. I need to filter this dataset such that:
If a field with a null SourceName exists and there's at least one other record for this field with a non-null SourceName, then it should be removed.
If it is the only row for that 'Field', then it should remain in the list.
Here's some example data. It consists of 3 columns: 'Field', 'SourceName' and 'Rate'.
Field | SourceName | Rate
10 | s1 | 9
10 | null | null
11 | null | null
11 | s2 | 5
11 | s3 | 4
12 | null | null
13 | null | null
13 | s4 | 7
13 | s5 | 8
8 | s6 | 2
9 | s7 | 23
9 | s8 | 9
9 | s9 | 3
Output should look like:
Field | SourceName | Rate
10 | s1 | 9
11 | s2 | 5
11 | s3 | 4
12 | null | null // <- remains since there's only 1 record for this 'Field'
13 | s4 | 7
13 | s5 | 8
8 | s6 | 2
9 | s7 | 23
9 | s8 | 9
9 | s9 | 3
How do I filter it?
What you are trying to achieve is not trivial and can't be solved with just a .Where() clause. Your filter criteria depends on a condition that requires grouping, so you will have to .GroupBy() and then flatten that collection of collections using .SelectMany().
The following code produces your expected output using LINQ to Objects, and I don't see any reason why LINQ to SQL couldn't translate it to SQL, though I haven't tried that.
//Group by the 'Field' field.
yourData.GroupBy(x => x.Field)
//Project the grouping to add a new 'IsUnique' field
.Select(g => new {
SourceAndRate = g,
IsUnique = g.Count() == 1,
})
//Flatten the collection using original items, plus IsUnique
.SelectMany(t => t.SourceAndRate, (t, i) => new {
Field = t.SourceAndRate.Key,
SourceName = i.SourceName,
Rate = i.Rate,
IsUnique = t.IsUnique
})
//Now we can do the business here; filter nulls except unique
.Where(x => x.SourceName != null || x.IsUnique);
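For readers who want to check the rule against the sample data, here is a minimal sketch in plain JavaScript rather than LINQ. It uses a slightly different technique from the query above: instead of counting rows per group, it records which fields have at least one non-null source and drops null rows only for those fields. dropRedundantNulls and the camel-cased property names are hypothetical.

```javascript
// Removes rows with a null sourceName when the same field
// also has at least one row with a non-null sourceName.
function dropRedundantNulls(rows) {
  // First pass: remember every field that has a real source.
  const fieldsWithSource = new Set();
  for (const row of rows) {
    if (row.sourceName !== null) fieldsWithSource.add(row.field);
  }
  // Second pass: keep non-null rows, and null rows for source-less fields.
  return rows.filter(
    (row) => row.sourceName !== null || !fieldsWithSource.has(row.field)
  );
}

const data = [
  { field: 10, sourceName: "s1", rate: 9 },
  { field: 10, sourceName: null, rate: null },
  { field: 11, sourceName: null, rate: null },
  { field: 11, sourceName: "s2", rate: 5 },
  { field: 11, sourceName: "s3", rate: 4 },
  { field: 12, sourceName: null, rate: null },
  { field: 13, sourceName: null, rate: null },
  { field: 13, sourceName: "s4", rate: 7 },
  { field: 13, sourceName: "s5", rate: 8 },
  { field: 8, sourceName: "s6", rate: 2 },
  { field: 9, sourceName: "s7", rate: 23 },
  { field: 9, sourceName: "s8", rate: 9 },
  { field: 9, sourceName: "s9", rate: 3 },
];

const filtered = dropRedundantNulls(data);
console.log(filtered);
```

With the sample data, the null rows for fields 10, 11 and 13 are dropped, and the single null row for field 12 is kept.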
Use LINQ's built-in 'Where' clause with a lambda expression:
Here is a simple static example that uses lambdas and a simple POCO class to store the data in a list like yours:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Simple
{
class Program
{
class Data
{
public string Field { get; set; }
public string SourceName { get; set; }
public string Rate { get; set; }
}
static List<Data> Create()
{
return new List<Data>
{
new Data {Field = "10", SourceName = null, Rate = null},
new Data {Field = "11", SourceName = null, Rate = null},
new Data {Field = "11", SourceName = "s2", Rate = "5"}
};
}
static void Main(string[] args)
{
var ls = Create();
Console.WriteLine("Show me my whole list: \n\n");
// write out everything
ls.ForEach(x => Console.WriteLine(x.Field + "\t" + x.SourceName + "\t" + x.Rate + "\n"));
Console.WriteLine("Show me only non nulls: \n\n");
// exclude some things
ls.Where(l => l.SourceName != null)
.ToList()
.ForEach(x => Console.WriteLine(x.Field + "\t" + x.SourceName + "\t" + x.Rate + "\n"));
Console.ReadLine();
}
}
}