Sink component doesn't get the right data with Kafka in Spring Cloud Data Flow - spring-integration

I am not a native English speaker, but I will try to express my question as clearly as possible.
I have encountered a problem that has confused me for two days, and I still can't find a solution.
I have built a stream which will run on Spring Cloud Data Flow on Hadoop YARN.
The stream is composed of an HTTP source, a processor, and a file sink.
1. HTTP Source
The HTTP source component has two output channels bound to two different destinations, dest1 and dest2, defined in application.properties:
spring.cloud.stream.bindings.output.destination=dest1
spring.cloud.stream.bindings.output2.destination=dest2
Below is the code snippet for the HTTP source for your reference.
@Autowired
private EssSource channels; // EssSource is the interface for multiple output channels

// output channel 1:
@RequestMapping(path = "/file", method = POST, consumes = {"text/*", "application/json"})
@ResponseStatus(HttpStatus.ACCEPTED)
public void handleRequest(@RequestBody byte[] body, @RequestHeader(HttpHeaders.CONTENT_TYPE) Object contentType) {
    logger.info("enter ... handleRequest1...");
    channels.output().send(MessageBuilder.createMessage(body,
            new MessageHeaders(Collections.singletonMap(MessageHeaders.CONTENT_TYPE, contentType))));
}

// output channel 2:
@RequestMapping(path = "/test", method = POST, consumes = {"text/*", "application/json"})
@ResponseStatus(HttpStatus.ACCEPTED)
public void handleRequest2(@RequestBody byte[] body, @RequestHeader(HttpHeaders.CONTENT_TYPE) Object contentType) {
    logger.info("enter ... handleRequest2...");
    channels.output2().send(MessageBuilder.createMessage(body,
            new MessageHeaders(Collections.singletonMap(MessageHeaders.CONTENT_TYPE, contentType))));
}
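For reference, a minimal sketch of what the EssSource binding interface could look like, assuming standard Spring Cloud Stream @Output declarations (the interface itself is not shown in the question, so treat this as an assumption):
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;

public interface EssSource {

    String OUTPUT = "output";    // bound to dest1 via application.properties
    String OUTPUT_2 = "output2"; // bound to dest2 via application.properties

    @Output(OUTPUT)
    MessageChannel output();

    @Output(OUTPUT_2)
    MessageChannel output2();
}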
2. Processor
The processor has two input channels and two output channels, each bound to a different destination.
The destination bindings are defined in application.properties in the processor project.
# input channel bindings
spring.cloud.stream.bindings.input.destination=dest1
spring.cloud.stream.bindings.input2.destination=dest2
# output channel bindings
spring.cloud.stream.bindings.output.destination=hdfsSink
spring.cloud.stream.bindings.output2.destination=fileSink
Below is the code snippet for the processor.
@Transformer(inputChannel = EssProcessor.INPUT, outputChannel = EssProcessor.OUTPUT)
public Object transform(Message<?> message) {
    logger.info("enter ... transform...");
    return "processed by transform1";
}

@Transformer(inputChannel = EssProcessor.INPUT_2, outputChannel = EssProcessor.OUTPUT_2)
public Object transform2(Message<?> message) {
    logger.info("enter ... transform2...");
    return "processed by transform2";
}
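Again for reference, a minimal sketch of what the EssProcessor binding interface might look like, assuming standard @Input/@Output declarations matching the constants used above (this interface is not shown in the question, so it is an assumption):
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface EssProcessor {

    String INPUT = "input";      // bound to dest1
    String INPUT_2 = "input2";   // bound to dest2
    String OUTPUT = "output";    // bound to hdfsSink
    String OUTPUT_2 = "output2"; // bound to fileSink

    @Input(INPUT)
    SubscribableChannel input();

    @Input(INPUT_2)
    SubscribableChannel input2();

    @Output(OUTPUT)
    MessageChannel output();

    @Output(OUTPUT_2)
    MessageChannel output2();
}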
3. The file sink component.
I use the official file sink component from Spring:
maven://org.springframework.cloud.stream.app:file-sink-kafka:1.0.0.BUILD-SNAPSHOT
And I just added the destination binding in its application.properties file.
spring.cloud.stream.bindings.input.destination=fileSink
4. Findings:
The data flow I expected is like this:
Source.handleRequest() --> Processor.handleRequest()
Source.handleRequest2() --> Processor.handleRequest2() --> Sink.fileWritingMessageHandler();
Only the string "processed by transform2" should be saved to the file.
But after my testing, the actual data flow is like this:
Source.handleRequest() --> Processor.handleRequest() --> Sink.fileWritingMessageHandler();
Source.handleRequest2() --> Processor.handleRequest2() --> Sink.fileWritingMessageHandler();
Both the "processed by transform1" and "processed by transform2" strings are saved to the file.
5. Question:
Although the destination for the output channel in Processor.handleRequest() binds to hdfsSink instead of fileSink, the data still flows to the file sink. I can't understand this, and it is not what I want.
I only want the data from Processor.handleRequest2() to flow to the file sink, not both.
If I'm not doing it right, could anyone tell me how to do it and what the solution is?
This has confused me for two days.
Thank you for your kind help.
Alex

Is your stream definition something like this (where the '-2' versions are the ones with multiple channels)?
http-source-2 | processor-2 | file-sink
Note that Spring Cloud Data Flow will override the destinations defined in application.properties, which is why, even if spring.cloud.stream.bindings.output.destination for the processor is set to hdfsSink, it will actually match the input of file-sink.
The way destinations are configured from a stream definition is explained here (in the context of taps): http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-stream-tap-dsl
What you can do is simply swap the meaning of channels 1 and 2: use the side channel for HDFS. This is a bit brittle, though, since the input/output channels of the stream are configured automatically while the other channels are configured via application.properties. In this case it may be better to configure the side-channel destinations via the stream definition or at deployment time; see http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_application_properties.
It seems to me that these could just as well be two streams listening on separate endpoints, using regular components, given that the data is supposed to flow side by side.
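For example, a hypothetical pair of stream definitions along those lines (the app names here are placeholders, not the actual apps from the question):
stream create --name hdfsStream --definition "http-source-1 | processor-1 | hdfs-sink"
stream create --name fileStream --definition "http-source-2 | processor-2 | file-sink"
Each stream then gets its own automatically configured destinations, so no side channels are needed.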

Related

Spring-Content: Moving files from content store to another content store

What I want: I'd like to move content from one ContentStore (regular) to another ContentStore (e.g. an archive) with Spring-Content version 1.2.7.
What I did is this (and it does work at least with DefaultFilesystemStoreImpls):
Creating two ContentStores like this:
#Bean(name = "mytmpfsstore1")
public ContentStore<File, String> getFileContentStore1() {
FileSystemResourceLoader loader = new FileSystemResourceLoader(".\\tmpstore1");
PlacementService placementService = new PlacementServiceImpl();
return new DefaultFilesystemStoreImpl<File, String>(loader, placementService, new FileServiceImpl());
}
Moving content from one ContentStore to another like this:
Optional<File> fileEntity = filesRepo.findById(id);
if (fileEntity.isPresent()) {
    Resource resource = regularContentStore.getResource(fileEntity.get());
    archiveContentStore.setContent(fileEntity.get(), resource);
    filesRepo.save(fileEntity.get());
    if (resource instanceof DeletableResource) {
        ((DeletableResource) resource).delete();
    }
}
Question: Is this the intended way of moving (/archiving) content with Spring-Content or is there a more elegant / more convenient / more intended way of moving/archiving files (especially from filesystem to S3 and back again)?
Spring Content doesn't provide any magic here. I try to keep the API fairly low-level and this use case, whilst valid, is a little too granular.
So, ultimately you have to build the "archiving" yourself and at the end of the day copy the content from one store to another as you are doing.
A couple of comments/pointers:
Not sure why you instantiated mytmpfsstore1 yourself. If you have file storage and S3 storage, you can just make your storage interfaces extend FileSystemContentStore and S3ContentStore respectively and let the framework instantiate them for you; i.e.
public interface MyTmpFsStore extends FileSystemContentStore {}
public interface MyS3store extends S3ContentStore {}
You can then wire these two beans into your 'archive' code. Assuming that is a controller, something like:
@RequestMapping(...)
public void archive(MyTmpFsStore fsStore, MyS3store s3Store) { ... }
where you do your copy operation.
ContentStore extends AssociativeStore, and there are a few different APIs for setting/unsetting content: Resource-based or InputStream-based. They all achieve the same objective at the end of the day. You could just as easily have used getContent instead of getResource and unsetContent instead of delete.
The 'archive' code probably needs to be in a @Transactional method so the archive operation is atomic.
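Putting those pointers together, a minimal sketch of such an archive operation, assuming the hypothetical filesRepo, regularContentStore, and archiveContentStore beans from the question and the stream-based getContent/unsetContent API mentioned above:
@Transactional
public void archive(Long id) throws IOException {
    Optional<File> fileEntity = filesRepo.findById(id);
    if (fileEntity.isPresent()) {
        File entity = fileEntity.get();
        // copy the content into the archive store
        try (InputStream content = regularContentStore.getContent(entity)) {
            archiveContentStore.setContent(entity, content);
        }
        // remove the content from the regular store and persist the updated entity
        regularContentStore.unsetContent(entity);
        filesRepo.save(entity);
    }
}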

Speaking with an ISpVoice from an ISpTTSEngine

I'm implementing an ISpTTSEngine for the Microsoft Speech API (SAPI). I'd like
this voice to enunciate just like a typical TTS voice. Rather than write my
own speech synthesizer, I'd like to delegate to a built-in ISpVoice.
I've written enough code to hear text vocalized, but it has a major deficiency
that I haven't been able to explain: the speech does not begin until after my
implementation of ISpTTSEngine::Speak has returned. For the duration of the
audible output, my implementation of ISpTTSEngine::Speak is not invoked, even
when the software using the TTS voice is sending requests.
(For context: my goal for this project is to programmatically observe the speech data that other pieces
of software are attempting to vocalize. That part appears to be working as
intended.)
The full source is available
here. I'll try to
summarize with the most relevant parts.
My implementation of ISpTTSEngine has a private member named
m_cpVoice:
class ATL_NO_VTABLE CTTSEngObj :
public CComObjectRootEx<CComMultiThreadModel>,
public CComCoClass<CTTSEngObj, &CLSID_SampleTTSEngine>,
public ISpTTSEngine,
public ISpObjectWithToken
{
// ...
private:
CComPtr<ISpVoice> m_cpVoice;
And it is initialized in the FinalConstruct
method:
HRESULT CTTSEngObj::FinalConstruct()
{
HRESULT hr = S_OK;
// ...
hr = m_cpVoice.CoCreateInstance(CLSID_SpVoice);
My implementation of ISpTTSEngine::Speak iterates over the text fragments it
receives
and passes the text data to the ISpVoice::Speak
method:
STDMETHODIMP CTTSEngObj::Speak(DWORD dwSpeakFlags,
REFGUID rguidFormatId,
const WAVEFORMATEX* pWaveFormatEx,
const SPVTEXTFRAG* pTextFragList,
ISpTTSEngineSite* pOutputSite)
{
// ...
for (const SPVTEXTFRAG* textFrag = pTextFragList; textFrag != NULL; textFrag = textFrag->pNext)
{
// ...
const std::wstring& text = textFrag->pTextStart;
hr = m_cpVoice->Speak(text.substr(0, textFrag->ulTextLen).c_str(), dwSpeakFlags | SPF_ASYNC | SPF_PURGEBEFORESPEAK, 0);
As mentioned above, no audio is emitted until after ISpTTSEngine::Speak
returns. An arbitrary sleep statement demonstrates this most clearly. Polling
the ISpVoice's SpeakCompleteEvent handle inevitably times out. Removing the
SPF_ASYNC flag from the invocation of ISpVoice::Speak causes the caller to
crash.
Can anyone explain this behavior? Or suggest a change that would allow me to
observe subsequent speech requests?
SAPI isn't expecting to be entered recursively. Consider using a different TTS engine (e.g., the WinRT Windows.Media.SpeechSynthesis APIs) to do the actual synthesis. The text fragments won't have any embedded markup, so that won't be a big deal.

Spring Integration: Switch routing dynamically

A Spring Integration-based converter consumes messages from one system, checks and converts them, and sends them to another one.
If the target system goes down, we stop the inbound adapters, but we would also like to persist locally, or forward, the currently "in-flight" converted messages. For that we would simply like to reroute the messages from the normal output channel to some "backup" channel dynamically.
In the docs I have found only the option to route messages based on their headers (so at some earlier step in the flow I would have to add those dynamically once the target system is unavailable), or based on the payload type, which is not really my case. The approach of adding a header dynamically and then filtering on it down the pipe, or during de-/serialization, still doesn't seem like the best one to me. I would rather be able to flip a switch (on some internal event) that would then reroute those "in-flight" messages to the "backup" channel.
What would be the best Spring Integration approach to achieve this? Thanks!
The router is not limited to the payload type or some header. You really can have a general POJO method invocation that returns a channel, its name, or some routing key which is mapped. That POJO method can indeed check some internal system state and produce this or that routing key.
So, you may have something like this in the router configuration:
.route(myRouter())
where your myRouter is something like this:
@Bean
public MyRouter myRouter() {
    return new MyRouter();
}
and its internal code might be like this:
public class MyRouter {

    @Autowired
    private SystemState systemState;

    String route(Object payload) {
        return this.systemState.isActive() ? "successChannel" : "backupChannel";
    }
}
The same can be achieved with a simple lambda definition:
.<Object, Boolean>route(p -> systemState().isActive(),
        m -> m.channelMapping(true, "successChannel")
              .channelMapping(false, "backupChannel"))
Also...
private final AtomicBoolean switcher = new AtomicBoolean();

@Bean
public IntegrationFlow flow() {
    return IntegrationFlows.from(() -> "foo", e -> e.poller(Pollers.fixedDelay(Duration.ofSeconds(5))))
            .route(s -> switcher.get() ? "foo" : "bar")
            .get();
}

Spring Integration XML and Java Config Conversion

I am very new to Spring Integration, and my project is using File Support to read a file and load it into a database.
I have an XML config and am trying to understand its content.
<int-file:inbound-channel-adapter auto-startup="true" channel="channelOne" directory="${xx}" filename-regex="${xx}" id="id" prevent-duplicates="false">
    <int:poller fixed-delay="1000" receive-timeout="5000"/>
</int-file:inbound-channel-adapter>
<int:channel id="channelOne"/>
From the above piece, my understanding is:
We define a channel, and
then define an inbound-channel-adapter; this will look in the directory for files and create a message with the file as the payload.
I was able to convert this to Java config as below:
@Bean
public MessageChannel fileInputChannel() {
    return new DirectChannel();
}

@Bean
@InboundChannelAdapter(value = "fileInputChannel", poller = @Poller(fixedDelay = "1000"))
public MessageSource<File> fileReadingMessageSource() {
    FileReadingMessageSource sourceReader = new FileReadingMessageSource();
    RegexPatternFileListFilter regexPatternFileListFilter = new RegexPatternFileListFilter(fileRegex);
    // Several filters could be combined via a CompositeFileListFilter:
    // List<FileListFilter<File>> fileListFilter = new ArrayList<FileListFilter<File>>();
    // fileListFilter.add(regexPatternFileListFilter);
    // CompositeFileListFilter compositeFileListFilter = new CompositeFileListFilter<File>(fileListFilter);
    sourceReader.setDirectory(new File(inputDirectorywhereFileComes));
    sourceReader.setFilter(regexPatternFileListFilter);
    return sourceReader;
}
Then comes the next piece of code, which I am really struggling to understand, and moreover to convert to Java config.
Here is that piece:
<int-file:outbound-gateway
delete-source-files="true"
directory="file:${pp}"
id="id"
reply-channel="channelTwo"
request-channel="channelOne"
temporary-file-suffix=".tmp"/>
<int:channel id="channelTwo"/>
<int:outbound-channel-adapter channel="channelTwo" id="id" method="load" ref="beanClass"/>
So from this piece, my understanding is:
1: Define an output channel.
2: Define an outbound-gateway, which will write the message as a file again into another directory and also remove the file from the source directory. Finally it will call the load method of the bean class. This is our class; its load method takes a file as input and loads it into the DB.
I tried to convert it into Java config. Here is my code:
@Bean
@ServiceActivator(inputChannel = "fileInputChannel")
public MessageHandler fileWritingMessageHandler() throws IOException, ParseException {
    FileWritingMessageHandler handler = new FileWritingMessageHandler(new File(path to output directory));
    handler.setFileExistsMode(FileExistsMode.REPLACE);
    beaObject.load(new File(path to output directory or input directory:: Nothing Worked));
    handler.setDeleteSourceFiles(true);
    handler.setOutputChannel(fileOutputChannel());
    return handler;
}
I am able to write the file to the output folder and was also able to delete it from the source. After that I am totally lost. I have to call the load method of my bean class (the ref class in the XML).
I tried a lot but was not able to get it working. I have read the Spring Integration File Support docs multiple times but couldn't make it work.
Note: when I tried, I got an error saying FileNotFoundException. I believe I am able to call my method but cannot get the file.
The XML config is working perfectly fine.
Can anyone also suggest the Spring Integration Java DSL version, if possible?
Please help me understand the basic flow and get this thing done. Any help and comments are really appreciated.
Thanks in advance.
First of all, you need to understand that a @Bean method is exactly for configuration and component definitions which are going to be used later at runtime. You definitely must not call business logic in a @Bean method. I mean that your beaObject.load() is totally wrong.
So, please go first to the Spring Framework docs to understand what @Bean and its parent @Configuration are: https://docs.spring.io/spring/docs/5.1.2.RELEASE/spring-framework-reference/core.html#beans-java
Your @ServiceActivator for the FileWritingMessageHandler is really correct (once you remove that beaObject.load()). What you just need is to declare one more @ServiceActivator for calling your beaObject.load() at runtime when a message appears in the fileOutputChannel:
@ServiceActivator(inputChannel = "fileOutputChannel")
public void loadFileIntoDb(File payload) {
    this.beaObject.load(payload);
}
See https://docs.spring.io/spring-integration/docs/5.1.1.BUILD-SNAPSHOT/reference/html/configuration.html#annotations for more info.
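Since the question also asks about the Java DSL: here is a minimal, equivalent sketch of the whole flow using the Spring Integration Java DSL. The directory paths, the regex, and the beaObject bean are placeholders taken from the question, so treat this as an assumption-laden outline rather than a drop-in config:
@Bean
public IntegrationFlow fileToDbFlow() {
    return IntegrationFlows
            // equivalent of <int-file:inbound-channel-adapter> with its poller
            .from(Files.inboundAdapter(new File("/path/to/input/dir"))
                            .regexFilter("yourFileRegex")
                            .preventDuplicates(false),
                    e -> e.poller(Pollers.fixedDelay(1000)))
            // equivalent of <int-file:outbound-gateway>
            .handle(Files.outboundGateway(new File("/path/to/output/dir"))
                            .temporaryFileSuffix(".tmp")
                            .fileExistsMode(FileExistsMode.REPLACE)
                            .deleteSourceFiles(true))
            // equivalent of <int:outbound-channel-adapter method="load" ref="beanClass"/>
            .handle(beaObject, "load")
            .get();
}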

Name an external dependency in Zipkin to have it drawn

I am using Zipkin with Spring Sleuth to display traces. When I use it locally, http://localhost:9411/zipkin/dependency/ displays a nicely drawn graph of dependencies within the ecosystem. Sometimes, backends from outside that ecosystem get called, and those are not displayed in that graph. Is it possible to annotate a call (let's assume RestTemplate and Feign clients) to such an external system so Zipkin would actually draw that dependency? If it's possible, what do I have to do?
This would be my baseline of code:
@Bean
RestTemplate restTemplate() {
    return new RestTemplate();
}

@RequestMapping("/")
public String callExternalBackend() {
    return restTemplate.getForObject("https://httpbin.org/get", String.class);
}
Somewhere I would like to type httpbin so this call gets drawn in the dependency-graph of Zipkin.
Thank you!
// Edit based on current solution
I'm using Spring Cloud Finchley and added the following line before restTemplate's call:
#RequestMapping("/")
public String callBackend() {
spanCustomizer.tag("peer.service", "httpbin");
return restTemplate.getForObject("https://httpbin.org/get", String.class);
}
I simply inject SpanCustomizer into this class. The span is sent to Zipkin, and I can see that the tag is set.
Unfortunately, it is not drawn in the dependencies view. Is there anything else I need to configure, maybe in Zipkin rather than in Sleuth?
EDGWARE
Have you read the documentation? If you use Spring Cloud Sleuth in the Edgware version and read the Sleuth section, you will find this piece of the documentation: https://cloud.spring.io/spring-cloud-static/Edgware.SR3/single/spring-cloud.html#_custom_sa_tag_in_zipkin
Let me copy that for you:
54.5 Custom SA tag in Zipkin
Sometimes you want to create a manual Span that will wrap a call to an external service which is not instrumented. What you can do is to create a span with the peer.service tag that will contain a value of the service that you want to call. Below you can see an example of a call to Redis that is wrapped in such a span.
org.springframework.cloud.sleuth.Span newSpan = tracer.createSpan("redis");
try {
    newSpan.tag("redis.op", "get");
    newSpan.tag("lc", "redis");
    newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_SEND);
    // call redis service e.g.
    // return (SomeObj) redisTemplate.opsForHash().get("MYHASH", someObjKey);
} finally {
    newSpan.tag("peer.service", "redisService");
    newSpan.tag("peer.ipv4", "1.2.3.4");
    newSpan.tag("peer.port", "1234");
    newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_RECV);
    tracer.close(newSpan);
}
Important: Remember not to add both the peer.service tag and the SA tag! You have to add only peer.service.
FINCHLEY
The SA tag will not work for Finchley. You have to do it in the following manner, using remoteEndpoint on the span:
Span span = tracer.newTrace().name("redis");
span.remoteEndpoint(Endpoint.newBuilder().serviceName("redis").build());
span.kind(CLIENT);
try(SpanInScope ws = tracer.withSpanInScope(span.start())) {
// add any tags / annotations on the span
// return (SomeObj) redisTemplate.opsForHash().get("MYHASH", someObjKey);
} finally {
span.finish();
}
