I need a generic way to filter IQueryable data, where the filters are supplied as a dictionary. I have already created a method like this:
public static IEnumerable<T> CustomApplyFilter<T>(this IQueryable<T> source, Dictionary<string, string> filterBy)
{
foreach (var key in filterBy.Keys)
{
source.Where(m => m.GetType().GetProperty(key).GetValue(m, null).Equals(filterBy[key]));
}
return source.ToList();
}
But it's always returning the same result.
Please find the caller:
Dictionary<string, string> dtFilter = new Dictionary<string, string>();
dtFilter.Add("Id", "2");
var res = context.Set<MyEntity>().CustomApplyFilter<MyEntity>(dtFilter);
The Where extension method does not change the content of the IQueryable it is applied to. The return value of the method should be used:
public static IEnumerable<T> CustomApplyFilter<T>(this IQueryable<T> source, Dictionary<string, string> filterBy)
{
foreach (var key in filterBy.Keys)
{
source = source.Where(m => m.GetType().GetProperty(key).GetValue(m, null).Equals(filterBy[key]));
}
return source.ToList();
}
UPDATE:
I should have noticed it: my answer so far applies to LINQ to Objects only. When using LINQ to Entities, however, there are certain restrictions; only expressions that can be translated to an SQL query can be used, and getting property values through reflection is obviously not such an expression.
When this is the case, one possible solution is to build the expression tree manually.
public static IEnumerable<T> CustomApplyFilter<T>(this IQueryable<T> source, Dictionary<string, string> filterBy)
{
foreach (var key in filterBy.Keys)
{
var paramExpr = Expression.Parameter(typeof(T), key);
var keyPropExpr = Expression.Property(paramExpr, key);
var eqExpr = Expression.Equal(keyPropExpr, Expression.Constant(filterBy[key]));
var condExpr = Expression.Lambda<Func<T, bool>>(eqExpr, paramExpr);
source = source.Where(condExpr);
}
return source.ToList();
}
UPDATE2:
As @Venkatesh Kumar pointed out in a comment below, when the underlying type of the given field is not string, this solution fails (with the error message: The binary operator Equal is not defined for the types 'System.Int64' and 'System.String').
One possible way to tackle this problem is to keep a dictionary mapping each property type to a delegate that parses the filter value into that type.
Since this is a static method (an extension method has to be static), declaring a static Dictionary at class scope is reasonable.
Let's assume the class in which CustomApplyFilter is declared is named SOFExtensions:
internal static class SOFExtensions
{
private static Dictionary<Type, Func<string, object>> lookup = new Dictionary<Type, Func<string, object>>();
static SOFExtensions()
{
lookup.Add(typeof(string), x => { return x; });
lookup.Add(typeof(long), x => { return long.Parse(x); });
lookup.Add(typeof(int), x => { return int.Parse(x); });
lookup.Add(typeof(double), x => { return double.Parse(x); });
}
public static IEnumerable<T> CustomApplyFilter<T>(this IQueryable<T> source, Dictionary<string, string> filterBy)
{
foreach (var key in filterBy.Keys)
{
var paramExpr = Expression.Parameter(typeof(T), key);
var keyPropExpr = Expression.Property(paramExpr, key);
if (!lookup.ContainsKey(keyPropExpr.Type))
throw new Exception("Unknown type : " + keyPropExpr.Type.ToString());
var typeDelegate = lookup[keyPropExpr.Type];
var constantExp = typeDelegate(filterBy[key]);
var eqExpr = Expression.Equal(keyPropExpr, Expression.Constant(constantExp));
var condExpr = Expression.Lambda<Func<T, bool>>(eqExpr, paramExpr);
source = source.Where(condExpr);
}
return source.ToList();
}
}
Other types and proper delegates for them should be added to the lookup Dictionary as required.
Is there a way to generate a CREATE TABLE statement from an entity definition? I know it is possible using Achilles, but I want to use the regular Cassandra entity.
The target is to get the following script from the entity class below.
Statement
CREATE TABLE user (userId uuid PRIMARY KEY, name text);
Entity
@Table(keyspace = "ks", name = "users",
readConsistency = "QUORUM",
writeConsistency = "QUORUM",
caseSensitiveKeyspace = false,
caseSensitiveTable = false)
public static class User {
@PartitionKey
private UUID userId;
private String name;
// ... constructors / getters / setters
}
Create a class named Utility in the package com.datastax.driver.mapping, so that it can access some package-private utility methods from that package.
package com.datastax.driver.mapping;
import com.datastax.driver.core.*;
import com.datastax.driver.core.utils.UUIDs;
import com.datastax.driver.mapping.annotations.ClusteringColumn;
import com.datastax.driver.mapping.annotations.Column;
import com.datastax.driver.mapping.annotations.PartitionKey;
import com.datastax.driver.mapping.annotations.Table;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.util.*;
/**
* Created by Ashraful Islam
*/
public class Utility {
private static final Map<Class, DataType.Name> BUILT_IN_CODECS_MAP = new HashMap<>();
static {
BUILT_IN_CODECS_MAP.put(Long.class, DataType.Name.BIGINT);
BUILT_IN_CODECS_MAP.put(Boolean.class, DataType.Name.BOOLEAN);
BUILT_IN_CODECS_MAP.put(Double.class, DataType.Name.DOUBLE);
BUILT_IN_CODECS_MAP.put(Float.class, DataType.Name.FLOAT);
BUILT_IN_CODECS_MAP.put(Integer.class, DataType.Name.INT);
BUILT_IN_CODECS_MAP.put(Short.class, DataType.Name.SMALLINT);
BUILT_IN_CODECS_MAP.put(Byte.class, DataType.Name.TINYINT);
BUILT_IN_CODECS_MAP.put(long.class, DataType.Name.BIGINT);
BUILT_IN_CODECS_MAP.put(boolean.class, DataType.Name.BOOLEAN);
BUILT_IN_CODECS_MAP.put(double.class, DataType.Name.DOUBLE);
BUILT_IN_CODECS_MAP.put(float.class, DataType.Name.FLOAT);
BUILT_IN_CODECS_MAP.put(int.class, DataType.Name.INT);
BUILT_IN_CODECS_MAP.put(short.class, DataType.Name.SMALLINT);
BUILT_IN_CODECS_MAP.put(byte.class, DataType.Name.TINYINT);
BUILT_IN_CODECS_MAP.put(ByteBuffer.class, DataType.Name.BLOB);
BUILT_IN_CODECS_MAP.put(InetAddress.class, DataType.Name.INET);
BUILT_IN_CODECS_MAP.put(String.class, DataType.Name.TEXT);
BUILT_IN_CODECS_MAP.put(Date.class, DataType.Name.TIMESTAMP);
BUILT_IN_CODECS_MAP.put(UUID.class, DataType.Name.UUID);
BUILT_IN_CODECS_MAP.put(LocalDate.class, DataType.Name.DATE);
BUILT_IN_CODECS_MAP.put(Duration.class, DataType.Name.DURATION);
}
private static final Comparator<MappedProperty<?>> POSITION_COMPARATOR = new Comparator<MappedProperty<?>>() {
@Override
public int compare(MappedProperty<?> o1, MappedProperty<?> o2) {
return o1.getPosition() - o2.getPosition();
}
};
public static String convertEntityToSchema(Class<?> entityClass) {
Table table = AnnotationChecks.getTypeAnnotation(Table.class, entityClass);
String ksName = table.caseSensitiveKeyspace() ? Metadata.quote(table.keyspace()) : table.keyspace().toLowerCase();
String tableName = table.caseSensitiveTable() ? Metadata.quote(table.name()) : table.name().toLowerCase();
List<MappedProperty<?>> pks = new ArrayList<>();
List<MappedProperty<?>> ccs = new ArrayList<>();
List<MappedProperty<?>> rgs = new ArrayList<>();
Set<? extends MappedProperty<?>> properties = MappingConfiguration.builder().build().getPropertyMapper().mapTable(entityClass);
for (MappedProperty<?> mappedProperty : properties) {
if (mappedProperty.isComputed())
continue; //Skip Computed
if (mappedProperty.isPartitionKey())
pks.add(mappedProperty);
else if (mappedProperty.isClusteringColumn())
ccs.add(mappedProperty);
else
rgs.add(mappedProperty);
}
if (pks.isEmpty()) {
throw new IllegalArgumentException("No partition key defined");
}
Collections.sort(pks, POSITION_COMPARATOR);
Collections.sort(ccs, POSITION_COMPARATOR);
StringBuilder query = new StringBuilder("CREATE TABLE ");
if (!ksName.isEmpty()) {
query.append(ksName).append('.');
}
query.append(tableName).append('(').append(toSchema(pks));
if (!ccs.isEmpty()) {
query.append(',').append(toSchema(ccs));
}
if (!rgs.isEmpty()) {
query.append(',').append(toSchema(rgs));
}
query.append(',').append("PRIMARY KEY(");
query.append('(').append(join(pks, ",")).append(')');
if (!ccs.isEmpty()) {
query.append(',').append(join(ccs, ","));
}
query.append(')').append(");");
return query.toString();
}
private static String toSchema(List<MappedProperty<?>> list) {
StringBuilder sb = new StringBuilder();
if (!list.isEmpty()) {
MappedProperty<?> first = list.get(0);
sb.append(first.getMappedName()).append(' ').append(BUILT_IN_CODECS_MAP.get(first.getPropertyType().getRawType()));
for (int i = 1; i < list.size(); i++) {
MappedProperty<?> field = list.get(i);
sb.append(',').append(field.getMappedName()).append(' ').append(BUILT_IN_CODECS_MAP.get(field.getPropertyType().getRawType()));
}
}
return sb.toString();
}
private static String join(List<MappedProperty<?>> list, String separator) {
StringBuilder sb = new StringBuilder();
if (!list.isEmpty()) {
sb.append(list.get(0).getMappedName());
for (int i = 1; i < list.size(); i++) {
sb.append(separator).append(list.get(i).getMappedName());
}
}
return sb.toString();
}
}
How to use it?
System.out.println(convertEntityToSchema(User.class));
Output:
CREATE TABLE ks.users(userid uuid,name text,PRIMARY KEY((userid)));
Limitations:
UDTs and collections are not supported.
Only these data types are supported and distinguished: long, boolean, double, float, int, short, byte, ByteBuffer, InetAddress, String, Date, UUID, LocalDate, Duration.
Based on the answer of Ashraful Islam, I have made a functional version in case someone is interested (@Ashraful Islam, please feel free to add it to your answer if you prefer).
I have also added support for ZonedDateTime, following the recommendation of Datastax to use a tuple<timestamp,varchar> type (see their documentation).
import com.datastax.driver.core.*;
import com.datastax.driver.mapping.MappedProperty;
import com.datastax.driver.mapping.MappingConfiguration;
import com.datastax.driver.mapping.annotations.Table;
import com.google.common.collect.ImmutableMap;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.time.ZonedDateTime;
import java.util.*;
import java.util.function.Predicate;
import java.util.stream.Collectors;
/**
* Inspired by Ashraful Islam
* https://stackoverflow.com/questions/44950245/generate-a-script-to-create-a-table-from-the-entity-definition/45039182#45039182
*/
public class CassandraScriptGeneratorFromEntities {
private static final Map<Class, DataType> BUILT_IN_CODECS_MAP = ImmutableMap.<Class, DataType>builder()
.put(Long.class, DataType.bigint())
.put(Boolean.class, DataType.cboolean())
.put(Double.class, DataType.cdouble())
.put(Float.class, DataType.cfloat())
.put(Integer.class, DataType.cint())
.put(Short.class, DataType.smallint())
.put(Byte.class, DataType.tinyint())
.put(long.class, DataType.bigint())
.put(boolean.class, DataType.cboolean())
.put(double.class, DataType.cdouble())
.put(float.class, DataType.cfloat())
.put(int.class, DataType.cint())
.put(short.class, DataType.smallint())
.put(byte.class, DataType.tinyint())
.put(ByteBuffer.class, DataType.blob())
.put(InetAddress.class, DataType.inet())
.put(String.class, DataType.text())
.put(Date.class, DataType.timestamp())
.put(UUID.class, DataType.uuid())
.put(LocalDate.class, DataType.date())
.put(Duration.class, DataType.duration())
.put(ZonedDateTime.class, TupleType.of(ProtocolVersion.NEWEST_SUPPORTED, CodecRegistry.DEFAULT_INSTANCE, DataType.timestamp(), DataType.text()))
.build();
private static final Predicate<List<?>> IS_NOT_EMPTY = ((Predicate<List<?>>) List::isEmpty).negate();
public static StringBuilder convertEntityToSchema(final Class<?> entityClass, final String defaultKeyspace, final long ttl) {
final Table table = Objects.requireNonNull(entityClass.getAnnotation(Table.class), () -> "The given entity " + entityClass + " is not annotated with @Table");
final String keyspace = Optional.of(table.keyspace())
.filter(((Predicate<String>) String::isEmpty).negate())
.orElse(defaultKeyspace);
final String ksName = table.caseSensitiveKeyspace() ? Metadata.quote(keyspace) : keyspace.toLowerCase(Locale.ROOT);
final String tableName = table.caseSensitiveTable() ? Metadata.quote(table.name()) : table.name().toLowerCase(Locale.ROOT);
final Set<? extends MappedProperty<?>> properties = MappingConfiguration.builder().build().getPropertyMapper().mapTable(entityClass);
final List<? extends MappedProperty<?>> partitionKeys = Optional.of(
properties.stream()
.filter(((Predicate<MappedProperty<?>>) MappedProperty::isComputed).negate())
.filter(MappedProperty::isPartitionKey)
.sorted(Comparator.comparingInt(MappedProperty::getPosition))
.collect(Collectors.toList())
).filter(IS_NOT_EMPTY).orElseThrow(() -> new IllegalArgumentException("No partition key defined in the given entity"));
final List<MappedProperty<?>> clusteringColumns = properties.stream()
.filter(((Predicate<MappedProperty<?>>) MappedProperty::isComputed).negate())
.filter(MappedProperty::isClusteringColumn)
.sorted(Comparator.comparingInt(MappedProperty::getPosition))
.collect(Collectors.toList());
final List<MappedProperty<?>> otherColumns = properties.stream()
.filter(((Predicate<MappedProperty<?>>) MappedProperty::isComputed).negate())
.filter(((Predicate<MappedProperty<?>>) MappedProperty::isPartitionKey).negate())
.filter(((Predicate<MappedProperty<?>>) MappedProperty::isClusteringColumn).negate())
.sorted(Comparator.comparing(MappedProperty::getPropertyName))
.collect(Collectors.toList());
final StringBuilder query = new StringBuilder("CREATE TABLE IF NOT EXISTS ");
Optional.of(ksName).filter(((Predicate<String>) String::isEmpty).negate()).ifPresent(ks -> query.append(ks).append('.'));
query.append(tableName).append("(\n").append(toSchema(partitionKeys));
Optional.of(clusteringColumns).filter(IS_NOT_EMPTY).ifPresent(list -> query.append(",\n").append(toSchema(list)));
Optional.of(otherColumns).filter(IS_NOT_EMPTY).ifPresent(list -> query.append(",\n").append(toSchema(list)));
query.append(',').append("\nPRIMARY KEY(");
query.append('(').append(join(partitionKeys)).append(')');
Optional.of(clusteringColumns).filter(IS_NOT_EMPTY).ifPresent(list -> query.append(", ").append(join(list)));
query.append(')').append(") with default_time_to_live = ").append(ttl);
return query;
}
private static String toSchema(final List<? extends MappedProperty<?>> list) {
return list.stream()
.map(property -> property.getMappedName() + ' ' + BUILT_IN_CODECS_MAP.getOrDefault(property.getPropertyType().getRawType(), DataType.text()))
.collect(Collectors.joining(",\n"));
}
private static String join(final List<? extends MappedProperty<?>> list) {
return list.stream().map(MappedProperty::getMappedName).collect(Collectors.joining(", "));
}
}
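For reference, a minimal usage sketch (assuming the User entity from the question; the keyspace "ks" and a TTL of 0 are just example arguments):
final StringBuilder script = CassandraScriptGeneratorFromEntities.convertEntityToSchema(User.class, "ks", 0L);
System.out.println(script);
// CREATE TABLE IF NOT EXISTS ks.users(
// userid uuid,
// name text,
// PRIMARY KEY((userid))) with default_time_to_live = 0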
In JDK 8 with lambda (build b93) there was a class java.util.stream.Streams whose zip method could be used to zip streams (this is illustrated in the tutorial Exploring Java8 Lambdas, Part 1 by Dhananjay Nene). This function:
Creates a lazy and sequential combined Stream whose elements are the
result of combining the elements of two streams.
However, in b98 this has disappeared. In fact, the Streams class is not even accessible in java.util.stream in b98.
Has this functionality been moved, and if so, how do I zip streams concisely using b98?
The application I have in mind is in this Java implementation of Shen, where I replaced the zip functionality in the
static <T> boolean every(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
static <T> T find(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
functions with rather verbose code (which doesn't use functionality from b98).
I needed this as well, so I just took the source code from b93 and put it in a "util" class. I had to modify it slightly to work with the current API.
For reference, here's the working code (take it at your own risk...):
public static<A, B, C> Stream<C> zip(Stream<? extends A> a,
Stream<? extends B> b,
BiFunction<? super A, ? super B, ? extends C> zipper) {
Objects.requireNonNull(zipper);
Spliterator<? extends A> aSpliterator = Objects.requireNonNull(a).spliterator();
Spliterator<? extends B> bSpliterator = Objects.requireNonNull(b).spliterator();
// Zipping loses DISTINCT and SORTED characteristics
int characteristics = aSpliterator.characteristics() & bSpliterator.characteristics() &
~(Spliterator.DISTINCT | Spliterator.SORTED);
long zipSize = ((characteristics & Spliterator.SIZED) != 0)
? Math.min(aSpliterator.getExactSizeIfKnown(), bSpliterator.getExactSizeIfKnown())
: -1;
Iterator<A> aIterator = Spliterators.iterator(aSpliterator);
Iterator<B> bIterator = Spliterators.iterator(bSpliterator);
Iterator<C> cIterator = new Iterator<C>() {
@Override
public boolean hasNext() {
return aIterator.hasNext() && bIterator.hasNext();
}
@Override
public C next() {
return zipper.apply(aIterator.next(), bIterator.next());
}
};
Spliterator<C> split = Spliterators.spliterator(cIterator, zipSize, characteristics);
return (a.isParallel() || b.isParallel())
? StreamSupport.stream(split, true)
: StreamSupport.stream(split, false);
}
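A quick usage sketch (assuming the method above lives in a class named StreamUtils; the names are illustrative only):
Stream<Integer> numbers = Stream.of(1, 2, 3);
Stream<String> words = Stream.of("one", "two", "three", "four");
List<String> pairs = StreamUtils.zip(numbers, words, (n, w) -> n + "=" + w)
        .collect(Collectors.toList());
// ["1=one", "2=two", "3=three"] -- truncated to the shorter stream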
zip is one of the functions provided by the protonpack library.
Stream<String> streamA = Stream.of("A", "B", "C");
Stream<String> streamB = Stream.of("Apple", "Banana", "Carrot", "Doughnut");
List<String> zipped = StreamUtils.zip(streamA,
streamB,
(a, b) -> a + " is for " + b)
.collect(Collectors.toList());
assertThat(zipped,
contains("A is for Apple", "B is for Banana", "C is for Carrot"));
If you have Guava in your project, you can use the Streams.zip method (added in Guava 21):
Returns a stream in which each element is the result of passing the corresponding element of each of streamA and streamB to function. The resulting stream will only be as long as the shorter of the two input streams; if one stream is longer, its extra elements will be ignored. The resulting stream is not efficiently splittable. This may harm parallel performance.
public class Streams {
...
public static <A, B, R> Stream<R> zip(Stream<A> streamA,
Stream<B> streamB, BiFunction<? super A, ? super B, R> function) {
...
}
}
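For example (a small usage sketch; as documented above, the result is truncated to the shorter input):
List<String> labels = Streams.zip(
        Stream.of("a", "b", "c"),
        Stream.of(1, 2),
        (s, i) -> s + i)
    .collect(Collectors.toList()); // ["a1", "b2"]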
Zipping two streams using JDK8 with lambda (gist).
public static <A, B, C> Stream<C> zip(Stream<A> streamA, Stream<B> streamB, BiFunction<A, B, C> zipper) {
final Iterator<A> iteratorA = streamA.iterator();
final Iterator<B> iteratorB = streamB.iterator();
final Iterator<C> iteratorC = new Iterator<C>() {
@Override
public boolean hasNext() {
return iteratorA.hasNext() && iteratorB.hasNext();
}
@Override
public C next() {
return zipper.apply(iteratorA.next(), iteratorB.next());
}
};
final boolean parallel = streamA.isParallel() || streamB.isParallel();
return iteratorToFiniteStream(iteratorC, parallel);
}
public static <T> Stream<T> iteratorToFiniteStream(Iterator<T> iterator, boolean parallel) {
final Iterable<T> iterable = () -> iterator;
return StreamSupport.stream(iterable.spliterator(), parallel);
}
Since I can't conceive of any use of zipping on collections other than indexed ones (Lists), and I am a big fan of simplicity, this would be my solution:
<A,B,C> Stream<C> zipped(List<A> lista, List<B> listb, BiFunction<A,B,C> zipper){
int shortestLength = Math.min(lista.size(),listb.size());
return IntStream.range(0,shortestLength).mapToObj( i -> {
return zipper.apply(lista.get(i), listb.get(i));
});
}
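A usage sketch (my own example values; the shorter list bounds the result):
List<String> names = Arrays.asList("ann", "bob");
List<Integer> ages = Arrays.asList(30, 40, 50);
zipped(names, ages, (n, a) -> n + ":" + a)
        .forEach(System.out::println); // prints ann:30 and bob:40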
The methods of the class you mentioned have been moved to the Stream interface itself in favor of default methods. But it seems that the zip method has been removed, perhaps because it is not clear what the default behavior for differently sized streams should be. Implementing the desired behavior is straightforward, though:
static <T> boolean every(
Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred) {
Iterator<T> it=c2.iterator();
return c1.stream().allMatch(x->!it.hasNext()||pred.test(x, it.next()));
}
static <T> T find(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred) {
Iterator<T> it=c2.iterator();
return c1.stream().filter(x->it.hasNext()&&pred.test(x, it.next()))
.findFirst().orElse(null);
}
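A brief sketch of how these behave (example values of my own):
// true: every pair (1,1), (2,2), (3,3) matches
boolean allEqual = every(Arrays.asList(1, 2, 3), Arrays.asList(1, 2, 3), Object::equals);
// 2: the first element of c1 that is greater than its counterpart in c2
Integer firstGreater = find(Arrays.asList(1, 2, 3), Arrays.asList(1, 1, 1), (a, b) -> a > b);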
I humbly suggest this implementation. The resulting stream is truncated to the shorter of the two input streams.
public static <L, R, T> Stream<T> zip(Stream<L> leftStream, Stream<R> rightStream, BiFunction<L, R, T> combiner) {
Spliterator<L> lefts = leftStream.spliterator();
Spliterator<R> rights = rightStream.spliterator();
return StreamSupport.stream(new AbstractSpliterator<T>(Long.min(lefts.estimateSize(), rights.estimateSize()), lefts.characteristics() & rights.characteristics()) {
@Override
public boolean tryAdvance(Consumer<? super T> action) {
// Report an element only when BOTH sides advanced; returning true without
// invoking the action would violate the Spliterator contract once the
// right stream is exhausted before the left one.
boolean[] paired = { false };
lefts.tryAdvance(left -> rights.tryAdvance(right -> {
action.accept(combiner.apply(left, right));
paired[0] = true;
}));
return paired[0];
}
}, leftStream.isParallel() || rightStream.isParallel());
}
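Example usage (a sketch):
zip(Stream.of("x", "y"), Stream.of(1, 2, 3), (l, r) -> l + r)
        .forEach(System.out::println); // x1, y2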
Using the latest Guava library (for the Streams class) you should be able to do
final Map<String, String> result =
Streams.zip(
collection1.stream(),
collection2.stream(),
AbstractMap.SimpleEntry::new)
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
The Lazy-Seq library provides zip functionality.
https://github.com/nurkiewicz/LazySeq
This library is heavily inspired by scala.collection.immutable.Stream and aims to provide immutable, thread-safe and easy to use lazy sequence implementation, possibly infinite.
Would this work for you? It's a short function, which lazily evaluates over the streams it's zipping, so you can supply it with infinite streams (it doesn't need to take the size of the streams being zipped).
If the streams are finite it stops as soon as one of the streams runs out of elements.
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.stream.Stream;
class StreamUtils {
static <ARG1, ARG2, RESULT> Stream<RESULT> zip(
Stream<ARG1> s1,
Stream<ARG2> s2,
BiFunction<ARG1, ARG2, RESULT> combiner) {
final var i2 = s2.iterator();
return s1.map(x1 -> i2.hasNext() ? combiner.apply(x1, i2.next()) : null)
.takeWhile(Objects::nonNull);
}
}
Here is some unit test code (much longer than the code itself!)
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import static org.junit.jupiter.api.Assertions.assertEquals;
class StreamUtilsTest {
@ParameterizedTest
@MethodSource("shouldZipTestCases")
<ARG1, ARG2, RESULT>
void shouldZip(
String testName,
Stream<ARG1> s1,
Stream<ARG2> s2,
BiFunction<ARG1, ARG2, RESULT> combiner,
Stream<RESULT> expected) {
var actual = StreamUtils.zip(s1, s2, combiner);
assertEquals(
expected.collect(Collectors.toList()),
actual.collect(Collectors.toList()),
testName);
}
private static Stream<Arguments> shouldZipTestCases() {
return Stream.of(
Arguments.of(
"Two empty streams",
Stream.empty(),
Stream.empty(),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"One singleton and one empty stream",
Stream.of(1),
Stream.empty(),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"One empty and one singleton stream",
Stream.empty(),
Stream.of(1),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.empty()),
Arguments.of(
"Two singleton streams",
Stream.of("blah"),
Stream.of(1),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("blah", 1))),
Arguments.of(
"One singleton, one multiple stream",
Stream.of("blob"),
Stream.of(2, 3),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("blob", 2))),
Arguments.of(
"One multiple, one singleton stream",
Stream.of("foo", "bar"),
Stream.of(4),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("foo", 4))),
Arguments.of(
"Two multiple streams",
Stream.of("nine", "eleven"),
Stream.of(10, 12),
(BiFunction<Object, Object, Object>) StreamUtilsTest::combine,
Stream.of(pair("nine", 10), pair("eleven", 12)))
);
}
private static List<Object> pair(Object o1, Object o2) {
return List.of(o1, o2);
}
static private <T1, T2> List<Object> combine(T1 o1, T2 o2) {
return List.of(o1, o2);
}
@Test
void shouldLazilyEvaluateInZip() {
final var a = new AtomicInteger();
final var b = new AtomicInteger();
final var zipped = StreamUtils.zip(
Stream.generate(a::incrementAndGet),
Stream.generate(b::decrementAndGet),
(xa, xb) -> xb + 3 * xa);
assertEquals(0, a.get(), "Should not have evaluated a at start");
assertEquals(0, b.get(), "Should not have evaluated b at start");
final var takeTwo = zipped.limit(2);
assertEquals(0, a.get(), "Should not have evaluated a at take");
assertEquals(0, b.get(), "Should not have evaluated b at take");
final var list = takeTwo.collect(Collectors.toList());
assertEquals(2, a.get(), "Should have evaluated a after collect");
assertEquals(-2, b.get(), "Should have evaluated b after collect");
assertEquals(List.of(2, 4), list);
}
}
public class Tuple<S,T> {
private final S object1;
private final T object2;
public Tuple(S object1, T object2) {
this.object1 = object1;
this.object2 = object2;
}
public S getObject1() {
return object1;
}
public T getObject2() {
return object2;
}
}
import java.util.Iterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;
public class StreamUtils {
private StreamUtils() {
}
public static <T> Stream<Tuple<Integer,T>> zipWithIndex(Stream<T> stream) {
Stream<Integer> integerStream = IntStream.range(0, Integer.MAX_VALUE).boxed();
Iterator<Integer> integerIterator = integerStream.iterator();
return stream.map(x -> new Tuple<>(integerIterator.next(), x));
}
}
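Usage might look like this (a short sketch using the classes above):
StreamUtils.zipWithIndex(Stream.of("a", "b", "c"))
        .forEach(t -> System.out.println(t.getObject1() + ": " + t.getObject2()));
// 0: a
// 1: b
// 2: c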
AOL's cyclops-react, to which I contribute, also provides zipping functionality, both via an extended Stream implementation that also implements the reactive-streams interface ReactiveSeq, and via StreamUtils, which offers much of the same functionality for standard Java Streams via static methods.
List<Tuple2<Integer,Integer>> list = ReactiveSeq.of(1,2,3,4,5,6)
.zip(Stream.of(100,200,300,400));
List<Tuple2<Integer,Integer>> list = StreamUtils.zip(Stream.of(1,2,3,4,5,6),
Stream.of(100,200,300,400));
It also offers more generalized Applicative based zipping. E.g.
ReactiveSeq.of("a","b","c")
.ap3(this::concat)
.ap(of("1","2","3"))
.ap(of(".","?","!"))
.toList();
//List("a1.","b2?","c3!");
private String concat(String a, String b, String c){
return a+b+c;
}
And even the ability to pair every item in one stream with every item in another
ReactiveSeq.of("a","b","c")
.forEach2(str->Stream.of(str+"!","2"), a->b->a+"_"+b);
//ReactiveSeq("a_a!","a_2","b_b!","b_2","c_c!","c_2")
If anyone still needs this, there is the StreamEx.zipWith function in the streamex library:
StreamEx<String> givenNames = StreamEx.of("Leo", "Fyodor");
StreamEx<String> familyNames = StreamEx.of("Tolstoy", "Dostoevsky");
StreamEx<String> fullNames = givenNames.zipWith(familyNames, (gn, fn) -> gn + " " + fn);
fullNames.forEach(System.out::println); // prints: "Leo Tolstoy\nFyodor Dostoevsky\n"
This is great. I had to zip two streams into a Map, with one stream being the keys and the other being the values:
Stream<String> streamA = Stream.of("A", "B", "C");
Stream<String> streamB = Stream.of("Apple", "Banana", "Carrot", "Doughnut");
final Stream<Map.Entry<String, String>> s = StreamUtils.zip(streamA,
streamB,
(a, b) -> {
final Map.Entry<String, String> entry = new AbstractMap.SimpleEntry<String, String>(a, b);
return entry;
});
System.out.println(s.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue())));
Output:
{A=Apple, B=Banana, C=Carrot}
I'm trying to understand how the Lucene query syntax works, so I wrote this small program.
When using a NumericRangeQuery I can find the documents I want, but when parsing a search condition with the same criteria, no hits are found.
I understand the difference could be explained by the analyzer, but the StandardAnalyzer is used, which does not remove numeric values.
Can someone tell me what I'm doing wrong?
Thanks.
package org.burre.lucene.matching;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.*;
import org.apache.lucene.util.Version;
public class SmallestEngine {
private static final Version VERSION=Version.LUCENE_48;
private StandardAnalyzer analyzer = new StandardAnalyzer(VERSION);
private Directory index = new RAMDirectory();
private Document buildDoc(String name, int beds) {
Document doc = new Document();
doc.add(new StringField("name", name, Field.Store.YES));
doc.add(new IntField("beds", beds, Field.Store.YES));
return doc;
}
public void buildSearchEngine() throws IOException {
IndexWriterConfig config = new IndexWriterConfig(VERSION,
analyzer);
IndexWriter w = new IndexWriter(index, config);
// Generate 10 houses with 0 to 3 beds
for (int i=0;i<10;i++)
w.addDocument(buildDoc("house"+(100+i),i % 4));
w.close();
}
/**
* Execute the query and show the result
*/
public void search(Query q) throws IOException {
System.out.println("executing query\""+q+"\"");
IndexReader reader = DirectoryReader.open(index);
try {
IndexSearcher searcher = new IndexSearcher(reader);
ScoreDoc[] hits = searcher.search(q, 10).scoreDocs;
System.out.println("Found " + hits.length + " hits.");
for (int i = 0; i < hits.length; ++i) {
int docId = hits[i].doc;
Document d = searcher.doc(docId);
System.out.println(""+(i+1)+". " + d.get("name") + ", beds:"
+ d.get("beds"));
}
} finally {
if (reader != null)
reader.close();
}
}
public static void main(String[] args) throws IOException, ParseException {
SmallestEngine me = new SmallestEngine();
me.buildSearchEngine();
System.out.println("SearchByRange");
me.search(NumericRangeQuery.newIntRange("beds", 3, 3,true,true));
System.out.println("-----------------");
System.out.println("SearchName");
me.search(new QueryParser(VERSION,"name",me.analyzer).parse("house107"));
System.out.println("-----------------");
System.out.println("Search3Beds");
me.search(new QueryParser(VERSION,"beds",me.analyzer).parse("3"));
System.out.println("-----------------");
System.out.println("Search3BedsInRange");
me.search(new QueryParser(VERSION,"name",me.analyzer).parse("beds:[3 TO 3]"));
}
}
The output of this program is:
SearchByRange
executing query"beds:[3 TO 3]"
Found 2 hits.
1. house103, beds:3
2. house107, beds:3
-----------------
SearchName
executing query"name:house107"
Found 1 hits.
1. house107, beds:3
-----------------
Search3Beds
executing query"beds:3"
Found 0 hits.
-----------------
Search3BedsInRange
executing query"beds:[3 TO 3]"
Found 0 hits.
You need to use NumericRangeQuery to perform a search on the numeric field.
The answer here could give you some insight.
Also the answer here says
for numeric values (longs, dates, floats, etc.) you need to have NumericRangeQuery. Otherwise Lucene has no idea how do you want to define similarity.
What you need to do is to write your own QueryParser:
public class CustomQueryParser extends QueryParser {
// ctor omitted
@Override
public Query newTermQuery(Term term) {
if (term.field().equals("beds")) {
// manually construct and return non-range query for numeric value
} else {
return super.newTermQuery(term);
}
}
@Override
public Query newRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) {
if (field.equals("beds")) {
// manually construct and return range query for numeric value
} else {
return super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
}
}
}
It seems you always have to use a NumericRangeQuery for numeric conditions (thanks to Mindas), so as he suggested I created my own, more intelligent QueryParser.
Using the Apache commons-lang function StringUtils.isNumeric(), I can create a more generic QueryParser:
public class IntelligentQueryParser extends QueryParser {
// take over super constructors
@Override
protected org.apache.lucene.search.Query newRangeQuery(String field,
String part1, String part2, boolean part1Inclusive, boolean part2Inclusive) {
if(StringUtils.isNumeric(part1))
{
return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1),Integer.parseInt(part2),part1Inclusive,part2Inclusive);
}
return super.newRangeQuery(field, part1, part2, part1Inclusive, part2Inclusive);
}
@Override
protected org.apache.lucene.search.Query newTermQuery(
org.apache.lucene.index.Term term) {
if(StringUtils.isNumeric(term.text()))
{
return NumericRangeQuery.newIntRange(term.field(), Integer.parseInt(term.text()),Integer.parseInt(term.text()),true,true);
}
return super.newTermQuery(term);
}
}
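For illustration, wiring it into the program from the question could look like this (hypothetical snippet; it assumes the (Version, String, Analyzer) constructor has been taken over from QueryParser):
me.search(new IntelligentQueryParser(VERSION, "beds", me.analyzer).parse("3"));
// now parsed as a NumericRangeQuery, so the 3-bed houses are found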
Just wanted to share this.