How to find text strings in a .xxx file


I'm working on a program that needs to find a tag in a .xxx file to just tell me if it exists or not in the file. I've been doing quite a bit of troubleshooting but I've realized there are three key things I don't know:
1. What a .xxx file is
2. Where to find help on how to work with .xxx files (Google didn't return anything useful)
3. How to read a string out of a .xxx file
I'm looking for help with these 3 things - specifically the 3rd, but help on the other two would mean I don't have to ask more questions later! I'm not in need of troubleshooting help yet - I'm not too worried about making my code run at this moment. This is more for reference and general knowledge so I don't have to ask 100 more questions about tedious specifics later on.
So, if anyone out there knows anything about these three problems, or has any knowledge on .xxx files, can you help me out?
(If you happen to know the code to do this, I'm writing in C#)

If you're using ReadLines, it assumes the file is a text file with line endings. If you try to use it on a binary file, it won't necessarily work, and the best you may get is a count of 0 or 1 if no line endings are found in the binary file at all.
In that case you'll have to load the bytes and do a more thorough search through the binary file for instances of your string.
But if you only want to know whether a LINE contains at least one instance (as you have written your code above), then it won't work for binary files, where you can't guarantee that line endings exist.
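A minimal sketch of the byte-level search described above, written in Python rather than the asker's C# (in C#, File.ReadAllBytes would be the analogous starting point). The function name file_contains_tag and the encodings it tries are assumptions for illustration, since the actual layout of the .xxx file is unknown.

```python
from pathlib import Path


def file_contains_tag(path, tag, encodings=("utf-8", "utf-16-le")):
    """Return True if `tag` occurs anywhere in the file's raw bytes.

    The tag is tried in each candidate encoding, because a binary format
    may store its text as ASCII/UTF-8 or as UTF-16 (an assumption here).
    """
    data = Path(path).read_bytes()  # reads the whole file; fine for small files
    return any(tag.encode(enc) in data for enc in encodings)
```

Because the search runs over the raw bytes, it does not depend on line endings being present. For very large files, reading in chunks (with some overlap between chunks) would be the more careful approach.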

Related

Why do some files appear as partial gibberish when opened in a text editor?

I often come across the situation where I would like to read a file's original content in a human-readable way. When opening this kind of file in a text editor, why is it usually gibberish mixed with some complete and comprehensible text? I would think that if the file is converted to something other than its original written format, there would be no comprehensible text remaining, yet I often find it is somewhere in between.
For example, I know that if I open a binary in a text format, there will be nothing comprehensible left that isn't purely accidental.
Example screencapture of partial gibberish text
Why is there complete text in here mixed with gibberish? Does that mean if I open the file with some sort of different encoding (I don't know what's possible), the file will come through as fully readable text? I would understand if it were all-or-nothing (either gibberish-non-readable OR human language) but I don't understand the in-between.
Please provide educational responses, rather than "because that's the way it is" type answers.

Those are formatting characters; there is no standard for them, and they vary with the format of the file in question. You can still extract the text as needed with a fair knowledge of grep and regex, but it won't be fun. The best bet is to open the file with software that can read it properly, because a text editor like gedit or Notepad++ will read the raw data and display it as-is. Adobe's PDF format has text embedded, for instance, and all that gibberish is instructions that tell the Reader software how to display it correctly on the screen, while still allowing for relatively straightforward text extraction when required.
Editors have no real way to interpret the special formatting characters, and would need to be loaded with APIs for every conceivable program. They would also need to be updated constantly, since the formatting changes regularly for a variety of reasons. Many times, it is just to keep the files from being backward compatible with their own or other products, forcing an upgrade path. Microsoft is rather famous for that, but they are by far not the only company to do so.
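To illustrate why readable text shows up between the gibberish, here is a small Python sketch that pulls runs of printable ASCII out of raw bytes, roughly the way the Unix strings tool does. The name printable_runs, the minimum length of 4 and the sample bytes are assumptions made for the example.

```python
import re


def printable_runs(data, min_len=4):
    """Return runs of at least `min_len` printable ASCII bytes found in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Made-up sample: embedded text surrounded by non-text bytes.
sample = b"\x00\x01%PDF-1.4\x00\xff\xfeHello, world\x00ab\x00"
print(printable_runs(sample))  # ['%PDF-1.4', 'Hello, world']
```

The text stored in the file is plain character data, so it survives being shown in an editor; the bytes around it are structure meant for the program that owns the format.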

How can I open a .cat file?

I'm looking for a way to open a .cat file. I don't have a single clue how to do it (I've tried Notepad and Sublime Text, without results). The only thing I know is that it's not corrupted: it's read by another program, but I need to see it with my own eyes to understand the structure of the content and create a similar file for my purposes.
Any hint is welcome.

If you can't make sense of it in a standard text editor, it's probably a binary format.
If so, you need to get yourself a program capable of doing hex dumps (such as od) and prepare for some detailed analysis.
A good start would be trying to find information about Advanced Disk Catalog somewhere on the web, assuming that's what it is.
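If a tool like od is not at hand, a rough stand-in can be sketched in a few lines of Python. The function name hex_dump and the column layout are assumptions; this is only meant to show what a hex dump contains (offset, byte values, and an ASCII column).

```python
def hex_dump(data, width=16):
    """Format bytes as lines of: offset, hex byte values, ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the text column.
        text_part = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)
```

Calling print(hex_dump(open("file.cat", "rb").read(256))) on a hypothetical file.cat would show its first 256 bytes, which is often enough to spot a signature or header fields.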

Picking Certain Documentation with DOXYGEN

I would like to achieve the almost exact opposite of what can be
performed with command \internal. There exists a huge doxygen
documentation for a project already, but now I would like to pick out
a few blocks (functions, constants etc.) to create a very small manual
only containing the important stuff.
Instead of marking 99% of the comments as \internal it would be nice
to have a command like \external for the 1% of comments that need to
be exported in my case.
Something like disabling the "default section" (everything, which is
not part of a section) would work too, of course. Then I could use
ENABLED_SECTIONS...
Unfortunately the comments in question do not reside in one file only.
Furthermore those files contain a lot of other comments, which should
not be exported.
I already thought about moving those comments into separate header files
that could be included at the original position, but this would mean
restructuring a lot and tearing files apart.
Does anybody have an idea how to solve my problem?
Thanks in advance,
Nico

I think ENABLED_SECTIONS is the way forward, but there are a couple of things that might reduce the workload.
The first is to create a separate doxyfile for your particular requirement, then you can customise that without upsetting any master one.
In that new doxyfile explicitly list, in the INPUT file list, only those files that contain content that you need. Chances are that it's currently set to pull in whole folder trees - edit that to cherry pick the individual files; not forgetting files that you may need to define the 'structure' of the document.
After that, use ENABLED_SECTIONS with corresponding \if <SECTION_NAME> ... \endif markers to refine the selection to units smaller than a file.

Filter lines starting with ; using batch

Hi, I have a script (AutoLISP, for AutoCAD) for a program. In this language, comments start with the ; character. Is it possible to write a batch file that filters out all lines starting with ;? I then encrypt the file from LSP to FAS format, which makes the comments useless (they can't be read once encrypted). However, AutoCAD still encrypts the comment text, which results in a fairly heavy file (double the size it should be). The current method is to delete every comment line by hand, but try doing that a few hundred times. And I need the comments in place to keep a neat record of what's happening, because I work from the unencrypted LISP file itself.
All in all, I also want the encryption because it's my hard work and my right to keep it secure, as it also means more job security. It also lets me block some smart alec, self-proclaimed staff from making edits, and in addition, file encryption is recommended for stability reasons by AutoCAD itself.
All in all, even if it were just because I like to, without good reason, that should be valid enough.
I'm looking to achieve this through a batch script, as that is one of the few languages I feel competent enough in… outside of the AutoCAD frame.

The following will convert a file named "source.lsp" and produce "noComment.lsp". It will strip out lines that start with a ; (including comment lines indented with spaces).
findstr /rvc:"^ *;" "source.lsp" >"noComment.lsp"
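For comparison, the same filter can be sketched in Python, in case a batch-only solution is not a hard requirement. The function name strip_comment_lines is an assumption; the pattern mirrors the one passed to findstr (optional leading spaces, then a ;).

```python
import re

# Same pattern as the findstr call: optional leading spaces, then ';'.
COMMENT_LINE = re.compile(r"^ *;")


def strip_comment_lines(text):
    """Drop every line that starts with ';' (optionally indented with spaces)."""
    kept = [line for line in text.splitlines(keepends=True)
            if not COMMENT_LINE.match(line)]
    return "".join(kept)
```

Like the findstr version, this only removes whole comment lines. A ; that appears after code on the same line, or inside a string, is left untouched.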

How to read an Excel (2007+ xlsx) sheet using ActionScript (AIR)?


as3xls
An Actionscript 3 library for reading and writing Excel files. Currently reading numbers, text, and formulas from Excel version 2.0-2003 and writing numbers, text, and dates to Excel 2.0 is supported. No server-side help is needed.
SUPPORT INFORMATION
Documentation and samples are at http://code.google.com/p/as3xls/

I wrote this: https://github.com/childoftv/as3-xlsx-reader I'd love to know if it helps

Do you have any idea how... inefficient this is?
Excel uses a complex file format, and unless you want to write a full-scale parser for its spreadsheets (which, believe me, will be difficult, if only to figure out what the format characters do), you'd be better off finding another solution.
Say, using a "save to XML" option would make your job a few thousand times easier, without exaggeration. AS3 has no native support for Excel, and there is no real point in it having any. But it has great integrated methods for working with XML.
If possible, save the Excel files to XML and parse those.
Better still, use databases, and parse them as XML through PHP.
I did a search and came up with this: http://code.google.com/p/php-excel-reader/
Once you've got it in PHP, passing it on to Flash is no problem at all. I'd recommend turning it into straight arrays of objects and converting it to AMF3 via Zend_Amf, AMFPHP or WebOrb, whichever one you're most comfortable with. You can then create tables, manipulate the data or whatever you like. It'd also be a lot faster and lighter than using XML.
PK

I took a look at the xlsx breakdown, and it would take me 1 week to write an xlsx writer that could do basic formatting and formulas. I've only spent 1 hour looking through the directories in an xlsx file, and all you'd have to do is create the same directory structure, mostly cut and paste some strings, then zip it and call it xlsx.
I tested this theory by manually making an xlsx file using 7zip. I downloaded childoftv's reader and, though I don't need the reader, the package includes a few zip/unzip classes that would prove helpful for anyone who wants to make an xlsx writer.
Long story short, the setup isn't complex, somebody just has to take a week out of their busy schedule to do it. I need this functionality so if nobody's done it yet, then I'll have to. Hopefully my search will find something better than a forum where the general consensus is "it's too hard, give up."
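The point that an xlsx file is a ZIP archive of XML parts can be checked with a short Python sketch (Python rather than ActionScript, purely for illustration). The function names list_parts and shared_strings are assumptions; xl/sharedStrings.xml is the part where a workbook normally keeps its cell text, and this sketch ignores edge cases such as phonetic runs.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the SpreadsheetML parts inside an .xlsx package.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_parts(xlsx):
    """Return the names of the files stored inside the .xlsx ZIP archive."""
    with zipfile.ZipFile(xlsx) as archive:
        return archive.namelist()


def shared_strings(xlsx):
    """Return the text entries of xl/sharedStrings.xml, or [] if absent."""
    with zipfile.ZipFile(xlsx) as archive:
        if "xl/sharedStrings.xml" not in archive.namelist():
            return []
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    # Each <si> is one string; rich text splits it across several <t> nodes.
    return ["".join(t.text or "" for t in si.iter(f"{{{NS['m']}}}t"))
            for si in root.findall("m:si", NS)]
```

Running list_parts on a real workbook shows the directory structure mentioned above, which is the same structure a writer would have to reproduce before zipping.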
