Why does Eclipse JDT not have a global symbol search?

In CDT there is an "Open Element" command to search for global symbols, but JDT has no equivalent.
JDT only offers "Java Search", which is not very convenient. Why does JDT not provide a function like this?

Anywhere in Eclipse you can use the general File Search to search for words regardless of their position in the text. This search can be limited to *.java files; the Whole word option may also be relevant for this question.
If you want more precise search results, JDT offers language-aware search, but for this added precision you need to specify the kind of symbol you are interested in (Search For). Without specifying the kind, search would be very similar to plain text search.
Both CDT and JDT use an index for search. The CDT index is said to be faster because it is more complete, whereas JDT search needs to operate in two phases: finding match candidates in the index, then exact matching using the resolved AST. In fact, efforts have started to port the concept of the more complete CDT index to JDT for improved search speed. As of Oxygen, however, this effort has not been completed.
Anyone who sees substantial benefit in being able to search for more than one kind of symbol at a time is invited to chime in at Bug 221081.

Related

Exclude comments from search results in IntelliJ global search?

I found the grammar error "it's" as a possessive on one page of a large project. I'm trying to search for any other usages of this on pages to correct it, but I'm getting results containing hundreds of comments. I just want to filter for the important user-facing portions of the project. Is there a way to exclude comments from the results of a global search?
In more recent versions, at least in PyCharm 2018 (similar to IntelliJ), there is a filter option "Except comments".
(Click the small filter icon to show the dropdown.)
Note: The selected filter option persists during a session, and the active filter option is not immediately apparent unless you open the dropdown. To prevent accidentally limiting subsequent searches, it may be a good idea to switch back to "Anywhere" afterward.
Another approach would be to enable the "Regular expression" (or "Regex") checkbox in the search dialog, then use some kind of negative lookaround to exclude comments.
In one case, I needed to exclude lines with single-line comments (e.g. # this is a comment) from a search, but not lines with inline comments (e.g. a=b+1 # this is an inline comment). The following did the trick, searching for something (for Python comments, starting with #):
^((?!#).)*something.*$
Please note I'm not a regex-expert, so this regex pattern can probably be improved upon greatly, but it illustrates the idea. You can play around with this on regex101. Any comments to improve the pattern are most welcome, of course.
Not sure if this approach could be extended to multiline comments, though (as in """ several lines here """).
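A minimal Python sketch of the lookahead idea above, run with the standard `re` module rather than the IDE search box. The sample text and the function name `find_outside_comments` are made up for illustration. `re.MULTILINE` is needed here so that `^` and `$` anchor at every line.

```python
import re

# Same pattern as above: every character before "something" must not be '#'.
PATTERN = re.compile(r"^((?!#).)*something.*$", re.MULTILINE)

def find_outside_comments(text):
    """Return the lines where 'something' appears with no '#' before it."""
    return [m.group(0) for m in PATTERN.finditer(text)]

sample = "\n".join([
    "# something in a full-line comment",
    "value = something + 1  # an inline comment",
    "other = 2  # something mentioned only in the comment",
])
print(find_outside_comments(sample))
```

Only the second sample line is reported: the first and third have a `#` before the search word, so the lookahead rejects them.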
It's difficult to suggest something without really looking at the code, but since it seems like a one-off thing, I would use global search restricted to comments to temporarily replace "it's" with some #temporary-token#, then use global search everywhere; you should see everything that's left. Then roll back the temporary token in the comments. Should be easy to try with VCS. Just an idea.
As you can see, with the "Comments only" option, only one #token is found.

Lucene: how can I find query hit positions in original contents?

Suppose I have a document collection that I have indexed in Lucene. I submit a query and get hits. Now what I want is to find where in a particular document hit(s) occur(s). I know that I can use the Lucene Highlighting classes to obtain relevant fragments. But how can I find out where exactly these fragments appear in the original contents?
A related question is how to make sure the found fragments are actually very close to the original query? I noticed in my experiments with highlighting that often I would have a multi-word query and it would return fragments that would have only some of these words. But what if I want to make sure I get hits with all the words?
Thanks!
Not an actual answer, just a few links to a solution to a similar problem.
First of all, here you can see the actual results of the highlighting (note that "were" is highlighted though "am" was in the query; stemming is an additional feature of this implementation):
http://hunglish.hu/search?huSentence=&enSentence=I%20am%20highlighted&size=20&page=2&doc.genre=-10
Here's the source. Look for these methods: highlightField, highlightBisen
http://code.google.com/p/hunglish-webapp/source/browse/trunk/src/main/java/hu/mokk/hunglish/lucene/Searcher.java
Disclaimer: I wrote this a while ago, it is not very nice code, and it is buggy in special cases: there is an open issue relating to highlighting. Furthermore, it uses version 3.2.0 of the lucene-highlighter, which is possibly not the newest.
Anyway, I hope if you look at how it works, it helps you write a better one, or at least something that works as expected.
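For the two questions above (where hits sit in the original text, and whether all query words are present), here is a rough Python sketch of the idea only. It is not Lucene's API: it simply re-scans the original string for whole-word matches and reports character offsets. The function names `term_offsets` and `contains_all_terms` and the sample text are hypothetical, and no stemming or analysis is applied.

```python
import re

def term_offsets(text, terms):
    """Map each query term to the (start, end) character offsets of its
    whole-word, case-insensitive occurrences in the original text."""
    offsets = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        offsets[term] = [m.span() for m in pattern.finditer(text)]
    return offsets

def contains_all_terms(text, terms):
    """True only if every query term occurs at least once in the text."""
    return all(spans for spans in term_offsets(text, terms).values())

doc = "The quick brown fox. A quick exit."
print(term_offsets(doc, ["quick", "fox"]))
print(contains_all_terms(doc, ["quick", "cat"]))
```

With an analyzer that stems or rewrites tokens, a plain re-scan like this will miss hits, so the offsets would have to come from the analysis step itself.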

Intelligent file search for windows that can ignore whitespace and search in code?

Does anybody know a Windows-based search tool that is easy to use and programmer-friendly?
The functions I am looking for:
Ignore white space in search
= capable of finding
myTestFunction ( $parameter, $another_parameter, $yet_another_parameter )
{ doThis();
using the query
myTestFunction($parameter,$another_parameter,$yet_another_parameter){doThis();
without Regexes.
Search code "semantically" (for me, it would have to be PHP):
Search in comments only
Search in function names only
Search for parameters that are named $xyz
Search in (insert code construct here) only
If there is none around, it's high time somebody developed it! :)
I have opened a bounty for this.
See our SD Search Engine. This is a language-sensitive search engine designed to search large code bases, with special language classifiers for C, C++, Java, C#, COBOL, JavaScript, Ada, Python, Ruby and a lot of other languages, including your specific target language PHP (PHP4 and PHP5).
I think it does everything you requested.
It indexes the language elements, so searches across large code bases are extremely fast (Linux kernel, ~7.5 million lines: 2.5 seconds). (The indexing step runs on Windows, but the display engine is in Java.)
Search hits are shown in a one-line-context hit window showing the file and line number, as well as the line with the hit highlighted. Clicks on hits bring up the source code, with tabs expanded appropriately and the line count right even for languages which have odd line counting rules (such as GCC WRT form characters), and with the hit line and hit text highlighted. Clicking in the source window will launch your favorite editor on the file.
Because it understands language elements, it ignores language-specific whitespace. It skips over comments unless you insist they be inspected. Searches thus ignore whitespace, comments and line boundaries (if the language thinks line boundaries are whitespace, which is why there are language-specific scanners). The query language allows you to specify which language tokens you want (specific tokens in quotes, or generic tokens such as identifiers I, numbers N, strings S, operators O and punctuation P) with constraints on the token value as well as a series of tokens.
Your example search:
myTestFunction($parameter,$another_parameter,$yet_another_parameter){doThis();
would be expressed to the search engine precisely as:
I=myTestFunction '(' I ',' I ',' I ')' '{' I=dothis '(' ')' ';'
but it would probably be easier (less typing) to find it as:
I=myTest* ... I=dothis
where I=myTest* means an identifier starting with myTest and ... means "near".
The Search Engine also offers regular expression searches on the text, if you insist. So you still have grep-like searches (a lot slower than indexed searches), but with the hit window and source display windows too.
I use ack really successfully for this kind of thing, particularly when trying to find things in large codebases. I run it on Linux myself, but I don't see any reason why it won't run on Windows, or in Cygwin at the very least. Check it out; I think you'll find it is exactly what you're looking for.
Search code "semantically" (for me, it would have to be PHP):
For this you could (and I think should) use some custom code using token_get_all()
See also the available tokens
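To illustrate the tokenizer-based approach, here is a small sketch in Python rather than PHP. It uses Python's standard `tokenize` module on Python source as a stand-in for what `token_get_all()` does for PHP source. The function name `search_comments` and the sample code are made up for the example.

```python
import io
import tokenize

def search_comments(source, needle):
    """Return (line_number, comment_text) for each comment containing needle."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only COMMENT tokens are inspected; strings and names are skipped.
        if tok.type == tokenize.COMMENT and needle in tok.string:
            hits.append((tok.start[0], tok.string))
    return hits

code = 'title = "todo list"\ncount = 1  # todo: remove\n'
print(search_comments(code, "todo"))
```

The string literal on line 1 also contains "todo" but is not reported, because only comment tokens are checked. Filtering on another token type (names, strings) follows the same pattern.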
Ignore white space in search
A simple regex should be sufficient. It depends on your regex-library, but most come with a whitespace modifier/flag.
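One way to get whitespace-insensitive matching without typing a regex by hand is to generate the regex from the plain query. The sketch below is an assumption about how this could be done, not a feature of any particular tool; the helper name `whitespace_insensitive` is hypothetical.

```python
import re

def whitespace_insensitive(query):
    """Compile a pattern that matches the query with any amount of whitespace
    (including newlines) allowed between its non-whitespace characters."""
    chars = [re.escape(c) for c in query if not c.isspace()]
    return re.compile(r"\s*".join(chars))

source = ("myTestFunction ( $parameter, $another_parameter, "
          "$yet_another_parameter )\n{ doThis();")
query = "myTestFunction($parameter,$another_parameter,$yet_another_parameter){doThis();"

match = whitespace_insensitive(query).search(source)
print(match is not None)
```

Note that this is purely textual: it would also match whitespace inside an identifier (for example "my TestFunction"), which a real tokenizer would not.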
For my Windows desktop search, I use Agent Ransack. I use this as a replacement for the Windows search.
You can use regular expressions, but there is a nice entry screen if you want to avoid entering them directly.
Take a look at the Google Desktop API; it has a very powerful set of methods to do what you're looking for.
Of course, it requires you to have Google Desktop installed.
After reviewing it a little, it provides some functionality, but nothing as specific as what you require.
I really like Crimson Editor and it allows RegEx searches. It has helped me a bunch over the past six years. I think it will fit your needs. Try it.
I use TextPad for searching code files in Windows. It has a very handy find-in-files function (Search / Find In Files) and you can use regex which should meet any search requirements. In the search results it will list the file location, line number and a snippet from that line.

Semantic difference between "Find" and "Search"?

When building an application, is there any meaningful difference between the idea of "Find" vs "Search" ? Do you think of them more or less as synonymous?
I'm asking in terms of labeling for application UI as well as API design.
Finding is the completion of searching.
If you might not succeed in finding something, call the feature "Search". For example text search in an editor can fail due to no matches - then calling it "Find" would be lying.
On the other hand: in an established job searching site, you can say "Find a PHP job" because you know that for (almost) anything your users want, there will be offerings. This also makes it sound confident, positive and energetic.
According to Steve Krug in Don't Make Me Think, when talking about usability for a publicly-facing web site, use the word Search for a search box and nothing else. (He specifically prohibits "Find", "Quick Find", "Quick Search", and all variations.)
The rationale is that "Search" is the most commonly understood term, so it's what people will look for when they aren't thinking, and you don't want your users to have to think (at all).
I would say that "find" is focused on getting a single, exact match. As in the example above, you "find" the perfect PHP job.
OTOH, you "search" for jobs that meet your criteria. Searching is what you do when you want to graze through several results. "Search" returns pages of results. "Find" is closer to "I'm feeling lucky."
Of course, the terms get used interchangeably sometimes. But, I think that's the essence of the difference.
In many applications, find means "find on the current page/screen", while search means "search the entire database/Internet." Web browsers, online help, and other applications seem to make this distinction.
Within most applications...
Find typically refers to locating text within the document at hand and jumps to the next occurrence.
Search typically refers to locating multiple documents (or other objects) and returns a list.
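For the API-design side of the question, the distinction above could be reflected in function shapes roughly like this. This is only a sketch of one possible convention; the names `find_next` and `search` and their signatures are hypothetical.

```python
def find_next(text, needle, start=0):
    """'Find': position of the next occurrence within one document, or None."""
    pos = text.find(needle, start)
    return None if pos == -1 else pos

def search(documents, needle):
    """'Search': names of all documents that contain the needle."""
    return [name for name, text in documents.items() if needle in text]

print(find_next("abcabc", "bc", 2))
print(search({"a.txt": "foo bar", "b.txt": "baz", "c.txt": "qux"}, "ba"))
```

Here "find" returns a single position and can be called again to step to the next one, while "search" returns a list of matching documents.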
I wrote the built-in Find command in Acrobat 1.0 and worked on the full text Search engine for Acrobat 2.0 and 3.0.
Most software at that point that handled large amounts of text had a way to locate an exact match to a single word or phrase and called it Find/Find Next. This is what we called it in Acrobat 1.0. We knew from the start that this wasn't enough to handle entire repositories of documents, so we needed a way to scan across a whole set. We couldn't use Find since that was already in the UI and had established behavior, so we settled on Search. The decision was based on little more than the relatively small set of common words that convey the action.
Even harder is to come up with a reasonable icon for it. Our initial take was to use something similar to the old Yellow Pages logo:
(source: yellowpagecity.com)
but the lawyers shot that down - it was too close. We couldn't use a magnifying glass as we had zoom functions tied to that. We went with binoculars.
I don't think that there is any difference.
But then again, I'm Portuguese. :P
Find = Discover exact
Example: We write "Please find attached" in an email. We don't write "Please search attached".
Search = Discover exact + Related match
Example: Google Search
"Seek and ye shall find"
"Search and you will find"
One angle that (surprisingly) no one has mentioned, is that in English when you say you search something, that something is the thing you're searching within, not the thing you're trying to find. So unless you add the word 'for' (as in, to search for something), the two words are fundamentally different.
It becomes obvious with an example:
Find the room.
Search the room.
Two very different tasks! The first defines the object of your search. The second defines the scope of your search.
That's not completely irrelevant when talking about UIs. If your app has a search feature where the user can specify both the source and the object of their search, you might choose to use the words this way. For example:
Search: Current document
Find: "positive and energetic"
Yes, as some others have pointed out, the word 'Find' does imply a successful search, but let's not start calling app designers liars for using it when success isn't guaranteed. It's become a pretty standard term for searching a document for a particular string.
I think search is more generic and more suitable for text search. Find sounds more like 'find a specific record or a group of records'
After searching, you find something.
Search for an answer on Stack Overflow so that you may find it.
For me Find is the success of a Search, that is to Find is to identify the location of something that's known to exist.
Search should always be used when you have no control on what the user is looking for.
Find talks about a specific one.
Search does not talk about a specific one.
Did you find the picture I requested yet?
No? Please search on the internet. I need to present it in an hour.
Another one is below
Please find the attachment in this email.
(or)
You'll find the attachment below.
(or)
Please find attached.
Here, we use "find" because it is a specific document which is attached to the email.
We don't use "search" here, as there is nothing to search for in a larger domain.
Search is the primary interface to the Web for many users. Search should be global (not scoped to a subsite) and available from every page; booleans should be made intimidating since users usually use them wrong
Read this: https://www.nngroup.com/articles/search-and-you-may-find/

Text indexer search tool which can filter by punctuation?

This is not a programming question per se, but a question about searching source code files, which helps me in programming.
I use a search tool, X1, which quickly tells me which source code files contain some keywords I am looking for. However, it doesn't work well for keywords which have punctuation attached to them. For example, if I search for "show()", X1 shows everything that has "show" in it, including far too many results from "MessageBox.Show(.....)" which I don't want to see.
Another example: I need to filter to show ".parent" (notice the dot) and not show everything that has "parent" (no dot) in it.
Does anyone know a text search tool which can filter by keywords that have punctuation? I really prefer a desktop app instead of a web-based tool like Google (I find it clunky).
I am looking for a tool which indexes words and not a general file searcher like Windows File Explorer.
If you want to search code files efficiently for keywords and punctuation,
consider the SD Source Code Search Engine. It indexes each source language according to language-specific rules, so it knows exactly the identifiers, keywords, strings, comments and operators in that language and indexes it according to those elements. It will handle a wide variety of languages (C, C++, Java, VB6, C#, COBOL) all at once.
Your first query would be posed as:
I=show - I=MessageBox ... '('
(locate identifiers named "show" but eliminate those that are overlapped by
MessageBox leftparen).
Your second query would be posed simply as:
'.' I=parent
See http://www.semanticdesigns.com/Products/SearchEngine/index.html
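To show why indexing punctuation as separate tokens helps with queries like ".parent" or "show()", here is a toy positional index in Python. It is only an illustration of the general idea, not how the SD Search Engine or X1 works internally; the tokenizer regex, the function names and the sample files are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Identifiers, numbers, or any single non-space character (punctuation).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(text):
    return TOKEN_RE.findall(text)

def build_index(files):
    """Map each token to the set of (file, position) pairs where it occurs."""
    index = defaultdict(set)
    for name, text in files.items():
        for pos, tok in enumerate(tokenize(text)):
            index[tok].add((name, pos))
    return index

def find_sequence(index, tokens):
    """Return sorted names of files containing the tokens consecutively."""
    if not tokens:
        return []
    hits = index.get(tokens[0], set())
    for offset, tok in enumerate(tokens[1:], start=1):
        following = index.get(tok, set())
        hits = {(f, p) for (f, p) in hits if (f, p + offset) in following}
    return sorted({f for f, _ in hits})

files = {
    "a.cs": "MessageBox.Show(msg);",
    "b.cs": "node.parent = x; window.show();",
    "c.cs": "parent = 1",
}
index = build_index(files)
print(find_sequence(index, [".", "parent"]))
print(find_sequence(index, ["show", "(", ")"]))
```

Because the dot and the parentheses are indexed as tokens with positions, ".parent" matches only where a dot directly precedes `parent`, and the lookup never scans file contents at query time. Matching here is case-sensitive, which is why `Show` in a.cs is not a hit for `show`.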
This seems to be a job for tools like ctags and cscope.
Ctags is used to index declarations in source files (many languages are supported), and Cscope is for in-depth C file analysis.
These tools are more suited to per-project use, in my opinion. Moreover, you may need another tool to use these indexes. I use vim myself for this purpose, but many text editors support ctags.
The tool from DTSearch.com.

Resources