Remove sections from a text file not containing certain strings - linux

I have a text document with information associated with emails, and another one with a list of emails.
I want to check whether the email in each block of document one (blocks are delimited by the "====" lines) matches an email in document two. If it does, that block should be output/saved together with its data.
So for example,
document A:
===================
JohnDoe#gmail.com
Tall man
Black hair
Blue eyes
===================
====================
jackandjones#gmail.com
Small man
Black hair
green eyes
=====================
=====================
janedoe#gmail.com
Tall women
Ginger hair
Blue eyes
=====================
Document two:
Johndoe#gmail.com
bobdylan#gmail.com
Janedoe#gmail.com
Desired output:
===================
JohnDoe#gmail.com
Tall man
Black hair
Blue eyes
===================
=====================
janedoe#gmail.com
Tall women
Ginger hair
Blue eyes
=====================
Sorry if I am not explaining this well. What jumps to mind is using the cut command, but I can't get my head around getting my desired output. Could anyone give me a nudge?

Assuming GNU awk, which supports a multi-character (regex) RS and the RT variable:
gawk '
FNR==NR { f1[tolower($1)];next}
!(FNR%2) && tolower($2) in f1 { print RT $0 RT }
' DocumentTwo RS='[=]+' FS='\n' DocumentA
yields:
===================
JohnDoe#gmail.com
Tall man
Black hair
Blue eyes
===================
=====================
janedoe#gmail.com
Tall women
Ginger hair
Blue eyes
=====================
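For readers without gawk, the same filtering idea can be sketched in Python. This is a minimal sketch, not the answer's method: the function name filter_blocks is hypothetical, and it assumes every block opens and closes with its own line of = characters and that the email is the first line of the block, as in the example above.

```python
import re

# A block is: a line of '=', the content lines, and a closing line of '='.
BLOCK_RE = re.compile(r"^=+\n(.*?)\n=+$", re.MULTILINE | re.DOTALL)


def filter_blocks(document_a, wanted_emails):
    """Return only the blocks whose first content line is a wanted email.

    The comparison is case-insensitive, like tolower() in the gawk version.
    """
    wanted = {email.strip().lower() for email in wanted_emails if email.strip()}
    kept = []
    for match in BLOCK_RE.finditer(document_a):
        first_line = match.group(1).split("\n", 1)[0].strip().lower()
        if first_line in wanted:
            kept.append(match.group(0))  # keep the block with its separators
    return "\n".join(kept)


if __name__ == "__main__":
    sample = (
        "===\nJohnDoe#gmail.com\nTall man\n===\n"
        "====\njackandjones#gmail.com\nSmall man\n====\n"
    )
    print(filter_blocks(sample, ["Johndoe#gmail.com"]))
```

With real files, the two arguments would come from reading DocumentA as one string and DocumentTwo as a list of lines.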

Related

How to print text with multiple different colors in the same line in Python 3?

I am trying to visualize some data on the terminal, and I am doing lots of printing to do that. The issue I am having is that certain character symbols look the same. I figured that coloring them differently would help me see the differences. I see from this link that there is a way to do it, but I don't understand what I am looking at. There is no explanation of what is going on in those solutions.
How do you specify, in the same line, text with different colors?
I should also mention that I am building the printable string OUTSIDE of the call to print(). How do you build the colorful string outside of the print() call?
A rewritten form of my question:
1. Colorize the text I print to the string with multiple types of colors.
2. Colorize the string BEFORE it gets sent to the "print()" call.
A couple examples would be great.
Using colorama just like the answer you linked is doing:
from colorama import Fore, Style
# Reset at the end so the red does not leak into later terminal output.
my_str = f"{Fore.BLUE}Hello, {Style.RESET_ALL} guys. {Fore.RED} I should be red.{Style.RESET_ALL}"
print(my_str)
This prints "Hello, " in blue, " guys. " in the default color, and " I should be red." in red.
As you can see, Fore.<color name> changes the color of the text after it, until the Style.RESET_ALL. After that you can change the color of the text again.
There could be multiple ways to achieve this. One which doesn't require any extra packages is to use ANSI color codes. Look at this link. Below are some examples.
s = "\033[1;32;40m Bright Green on black \033[1;31;43m Red on yellow \033[1;34;42m Blue on green \033[0m"
print(s)
In the first code, \033[1;32;40m, \033[ introduces the escape sequence, followed by 1 for bold (which many terminals render as a brighter color), 32 for green text, and 40 for a black background. The three codes are separated by ; and terminated by m. Supplying all three codes (1, 32 and 40 here) isn't mandatory though.
Other ways to achieve this can be found here.
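To address the second part of the question (building the colored string before it reaches print()), here is a minimal sketch using only ANSI escape codes. The helper name colorize and the COLORS table are hypothetical, chosen for illustration; the codes 31, 32, 34 and the reset code 0 are standard ANSI SGR values, and the terminal is assumed to support them.

```python
# Standard ANSI SGR codes; the dictionary and helper names are illustrative.
RESET = "\033[0m"
COLORS = {
    "red": "\033[31m",
    "green": "\033[32m",
    "blue": "\033[34m",
}


def colorize(text, color):
    """Wrap text in a color code and reset afterwards."""
    return f"{COLORS[color]}{text}{RESET}"


# The string is assembled first, with several colors on one line...
line = colorize("O", "red") + colorize("0", "green") + " plain " + colorize("o", "blue")

# ...and only then handed to print().
print(line)
```

Because every colored piece ends with the reset code, pieces can be concatenated or joined freely without one color bleeding into the next.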

Alternatives for red/green in diff output

I once read an article arguing that red and green are bad choices for diff, because
some people have red-green color blindness
red implies "bad" and green implies "good", but deleted code is often not bad and new code is also not always good.
However, I cannot remember where I found that article and which alternative colors were suggested.
What would be sensible alternative colors for red/green?
Since it's still here - I'd go with blue and red. Blue is more neutral, and red at least has a connotation of "deleted".

Extracting properties and attributes for entities

I want to extract attributes and their values for named entities. For example:
Lisa has a pet cat named Whiskers. Whiskers is black with a white spot on her chest. Whiskers also has white paws that look like little white mittens. Whiskers likes to sleep in the sun on her favorite chair. Whiskers also likes to drink creamy milk.
One possible extraction of attributes for each entity is the following:
Lisa:
Has -> Whiskers
Whiskers:
Color -> Black
Likes to -> {Sleep in the sun on Lisa's favorite chair, drink creamy milk}
You could search for phrase structures that correspond to the relationships you want to extract. For example, you could find all the phrases of the form noun-phrase verb-phrase noun-phrase and turn them into subject-predicate-object tuples. The more specific your sentence patterns are, the better this is likely to work. The pattern Python library makes this pretty easy to do.
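The idea of matching specific sentence patterns and emitting subject-predicate-object tuples can be illustrated with a deliberately naive sketch. It uses plain regular expressions from the standard library rather than the pattern library or a real parser, and the two patterns, the predicate labels, and the function name extract_triples are all assumptions made for this example.

```python
import re

TEXT = (
    "Whiskers is black with a white spot on her chest. "
    "Whiskers likes to sleep in the sun on her favorite chair. "
    "Whiskers also likes to drink creamy milk."
)

# Hand-written sentence patterns; each yields (subject, predicate, object).
PATTERNS = [
    ("color", re.compile(r"^(?P<subj>[A-Z]\w+) is (?P<obj>\w+)")),
    ("likes to", re.compile(r"^(?P<subj>[A-Z]\w+)(?: also)? likes to (?P<obj>.+)$")),
]


def extract_triples(text):
    """Split text into sentences and apply every pattern to each sentence."""
    triples = []
    for sentence in re.split(r"(?<=\.)\s+", text.strip()):
        sentence = sentence.rstrip(".")
        for predicate, pattern in PATTERNS:
            match = pattern.match(sentence)
            if match:
                triples.append((match["subj"], predicate, match["obj"]))
    return triples


for triple in extract_triples(TEXT):
    print(triple)
```

Patterns this narrow only cover the sentences they were written for; a part-of-speech tagger or chunker would be needed to generalize to noun-phrase and verb-phrase structures.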

Search for a word and export the 35 characters after it using a shell script?

I have a file input.txt which contains lots of weird characters, HTML tags, and useful material. I want to write the 35 characters that follow the word description to a new file output.txt, excluding weird characters like $$#$##$##***$# and HTML tags. Please help.
Thanks in advance.
My final goal is to find the word description and print the 35 characters after it, not counting HTML tags and weird characters. Is it possible? For example, given:
<description><p><img class="float_right"
src="http://static3.businessinsider.com/image/502ab0036bb3f7147b00000f-400-300/dnu.jpg"
border="0" alt="dnu" width="400" height="300" /></p><p>The lawn
was filled with <a class="hidden_link"
href="http://www.businessinsider.com/blackboard/goldman-sachs">Goldman
Sachs</a> Group Inc. partners dressed in pink looking out on a pink sunset.
I want to start from "The lawn was filled with", skip the tags again, continue with "Group Inc. partners" until 35 characters are reached, then stop and search for the next description.
You can select all the text within an HTML node using XPath. In your case this should work:
xpath -q -e '//description//text()' input.txt
The query //description//text() works as follows:
//description: drill down the HTML document till you find a node named description
//text(): within this node drill down all other nodes and select their text
Given your data this outputs:
The lawn was filled with
Goldman Sachs
Group Inc. partners dressed in pink looking out on a pink sunset.
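The XPath query returns all of the text; trimming it to 35 characters is a separate step. Below is a minimal Python sketch of the whole task under several assumptions: tags are stripped with a simple regular expression rather than a real HTML parser, link text such as "Goldman Sachs" is kept, "weird characters" means anything other than letters, digits, spaces and basic punctuation, and the function name description_previews is hypothetical.

```python
import re


def description_previews(markup, length=35):
    """Return the first `length` visible characters of each <description>."""
    previews = []
    # Everything before the first <description> tag is ignored.
    for chunk in re.split(r"<description[^>]*>", markup)[1:]:
        body = chunk.split("</description>", 1)[0]
        text = re.sub(r"<[^>]+>", " ", body)            # drop HTML tags
        text = re.sub(r"[^A-Za-z0-9.,' ]+", " ", text)  # drop other symbols (assumption)
        text = " ".join(text.split())                   # collapse whitespace
        previews.append(text[:length])
    return previews


sample = (
    '<description><p><img class="float_right" src="x.jpg" /></p><p>The lawn\n'
    'was filled with <a class="hidden_link" href="http://e/x">Goldman\n'
    "Sachs</a> Group Inc. partners dressed in pink $$#$## looking out.</description>"
)
print(description_previews(sample))
```

To produce output.txt, the returned list could be written one preview per line after reading input.txt into a string.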

Usability for notification messages, colors [web apps]

In each web app I develop, I like to add three types of messages:
Green/blue for success messages
Yellow for warnings
Red for errors
And perhaps, a neutral one for information, which is gray or blue if the success one is green.
The success one is used when an item is created or updated. The yellow one is for when something is wrong, but not we-are-going-to-die wrong. The red one is for when something is blocked or we are going to die.
However, there's one thing I can't figure out: when I delete an object, what kind of notification should I use? I don't think the success one fits, because although the deletion was successful, the user tends not to read the message and only looks at the color, and green is not what they expect after deleting something.
The red one might fit, but it can be misunderstood ("I tried to delete it but there was an error"). The warning and the information ones might be good choices, but I'm not really sure.
Also, when you ask for confirmation about deleting something, should the 'cancel' button be green or red?
I'm just curious how you guys handle this. Thanks.
In general, I rely on the OS to provide appropriate colors.
The problem is with vision-impaired users. I can't predict whether or not they can read text set against any background I might choose. I assume that they've configured their browser and OS to display the colors that they can read the best.
Mike brings up a good point. Using colors assumes the user can see colors. Perhaps adding an icon (with contrasting foreground and background colors) to your messages may help with the ambiguity.
For example:
Exclamation: Exclamation point in a triangle with a yellow background.
Asterisk: Lowercase letter i in a talk bubble.
Stop: White X in a circle with a red background.
Error: White X in a circle with a red background.
Warning: Exclamation point in a triangle with a yellow background.
Information: Lowercase letter i in a comic bubble.
Question: Blue question mark in a talk bubble.
I usually do it the following way:
If the user's intention was to delete something, and they did, I show it in green. If they don't read it because it's green, they will assume that whatever they wanted to do has been done correctly.
At the "are you sure?" stage, the user may have gotten there by accident, so a color may confuse or scare them. I keep the "delete/cancel" buttons in a neutral color.

Resources