Hooking Kernel sys_read() Not Affecting Text Editors - linux

So, I've been doing a little kernel module programming, and I have a working module installed that screens text files with a certain name and replaces every occurrence of one word with another. I do this by tracking the files I want in the module through a hook on sys_open(), and then performing the rewrite in my hook of sys_read().
However, the effect is only seen when I cat the file (or read it with awk, or print it from bash); opening the screened file in a text editor just displays the unfiltered text.
My question is: why doesn't hooking sys_read() affect the output of text editors? I have tried vi, vim, gedit and nano. Are they obtaining the file contents in another way? I know the editor is calling sys_read() because my printk debug message appears in dmesg, but maybe it is throwing away the read buffer and using another technique?
Just wondering what is happening.
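For context, this kind of hook is usually wired up by swapping entries in the syscall table. The following is a stripped-down sketch under that assumption, for an older x86 kernel where the address of sys_call_table has already been found (e.g. from System.map or kallsyms) and its write protection is handled separately; fd_is_tracked() is a hypothetical stand-in for the bookkeeping a sys_open() hook would do:

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/unistd.h>

    /* Assumed to be resolved at load time (e.g. from System.map or
     * kallsyms_lookup_name()); making the table writable is also assumed
     * to be handled elsewhere -- neither step is shown here. */
    static unsigned long **sys_call_table_addr;

    /* Saved pointer to the original sys_read() entry. */
    asmlinkage long (*orig_read)(unsigned int fd, char __user *buf, size_t count);

    /* Hypothetical helper: nonzero if the sys_open() hook marked this fd
     * as belonging to a screened file. */
    static int fd_is_tracked(unsigned int fd)
    {
        return 0;
    }

    asmlinkage long hooked_read(unsigned int fd, char __user *buf, size_t count)
    {
        long ret = orig_read(fd, buf, count);

        if (ret > 0 && fd_is_tracked(fd)) {
            /* copy_from_user() into a kernel buffer, rewrite the word,
             * copy_to_user() back into buf -- the actual filtering step. */
        }
        return ret;
    }

    static int __init hook_init(void)
    {
        orig_read = (void *)sys_call_table_addr[__NR_read];
        sys_call_table_addr[__NR_read] = (unsigned long *)hooked_read;
        return 0;
    }

    static void __exit hook_exit(void)
    {
        sys_call_table_addr[__NR_read] = (unsigned long *)orig_read;
    }

    module_init(hook_init);
    module_exit(hook_exit);
    MODULE_LICENSE("GPL");

Any read() that goes through this table entry is seen by the hook, whichever process makes the call.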

It could be that your text editor is using a separate temporary file while you are editing, and only saving it under the real name when you close/exit the editor.

Related

In vim, how can I write a line to the top of the buffer on open, and remove that line on save?

I am using vim as my text editor for programming in React/JavaScript, and trying to use Flow for static type checking. In my .flowconfig, I have all=true, meaning I would like to have Flow type checking on all .js files without requiring a comment at the top of each file. I have the vim plugin ale set up to lint Flow and give errors, and it works fine when I have the Flow comment (// @flow) at the top of the file, but I would like this to work on every .js file without needing the comment at the top (as specified in my .flowconfig).
After spending some time tinkering with ale, I thought my best option might be to simply prepend the comment at the top of the file when a new buffer is opened, and then remove it before saving. Is there a way to do this in my .vimrc with an autocommand?
Bonus: Extra brownie points if you can get this to work using only ale, without needing to add the Flow comment to every file.

using hard link with kate editor

I have a problem working with the link command in Linux Mint.
I made file1 and added a new hard link to that file:
link file1 file2
I know that when I change the contents of file1, file2 should change too.
That works as expected when I edit file1 with vim or append text to it with redirection, but when I edit file1 with the Kate editor, it's as if the editor breaks the link to file2. After that, when I change the contents of file1 with Kate, vim, or anything else, file2 never changes any more.
What's the problem?
I'm one of the Kate developers. The issue is as follows: Whenever Kate saves, it does so by saving into a temporary file in the same folder, and on success just does a move to the desired location.
This move operation is exactly what destroys your hard link: first, the hard link is removed, then the temporary file is renamed.
While this avoids data loss, it also has its own problems, as you have experienced. We are tracking this bug here:
https://bugs.kde.org/show_bug.cgi?id=358457 - QSaveFile: Kate removes a hard link to file when opening a file with several hard links
And in addition, QSaveFile also has two further issues, tracked here:
https://bugs.kde.org/show_bug.cgi?id=333577 - QSaveFile: kate changes file owner
https://bugs.kde.org/show_bug.cgi?id=354405 - QSaveFile: Files are unlinked upon save
The solution would be to write directly to the file in all these corner cases; that would avoid this trouble at the expense of losing data in case of a system crash, so it's a tradeoff. To fix this, we need to patch Qt, which no one has done so far.
Different programs save files in different ways. At least two come to mind:
open existing file and overwrite its content
create temporary file, write new content there, then somehow replace original file with new one (remove old file and rename new one; or rename old file, rename new file, then remove old file; or use system function to swap files content (in fact, swap names of files), then remove old file; or ...)
Judging from its current source code, Kate is using the latter approach (ultimately via QSaveFile, though with a direct-write fallback). Why? Usually, to make file saving more or less atomic, and to make sure that saving won't fail partway through due to, e.g., a lack of free space (although this also means that saving temporarily needs space for both the old and the new file).
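For illustration, here is a minimal plain-C sketch of that second approach (this is not Kate's actual QSaveFile code): write the new content to a temporary file in the same directory, then rename() it over the original. The rename replaces the directory entry, so any other hard link keeps pointing at the old inode and stops following edits, which is exactly the behavior described in the question.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Save data by writing a temporary file and renaming it over "path". */
    static int atomic_save(const char *path, const char *data, size_t len)
    {
        char tmp[4096];
        snprintf(tmp, sizeof(tmp), "%s.tmp.XXXXXX", path);

        int fd = mkstemp(tmp);              /* temporary file next to the target */
        if (fd < 0)
            return -1;

        if (write(fd, data, len) != (ssize_t)len || fsync(fd) < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        close(fd);

        /* Atomically give the new file the old name.  The old inode loses this
         * name; a second hard link to it is untouched and keeps the old content. */
        return rename(tmp, path);
    }

    int main(void)
    {
        const char msg[] = "new content\n";
        return atomic_save("file1", msg, sizeof(msg) - 1) == 0 ? 0 : 1;
    }

After running this against file1 from the question, ls -i shows that file1 has a new inode number while file2 still has the old one.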
I don't have Kate on Linux Mint but I have noticed issues which lead me to suspect you may have come across a "bug".
Here are two 'similar' issues with hard links that have been reported.
Bug 316240 - KSaveFile: Kate/kwrite makes a copy of a file while editing a hard link
"Hard link not supported" with NTFS on Linux #3241

filetype and corresponding program that will honor control characters

I am using Node.js to write to log files, using the colors module, which I believe inserts control characters into strings for coloring/text formatting that displays in a terminal application.
When I write to the terminal directly, it shows colors, but when I write to a .log file and then tail the log file with either Terminal.app or iTerm2, it does not show colors/text formatting. Does anybody know why this is? My guess is that when you write to the log file the control characters don't get saved, so that when tailing they won't display at all?
Perhaps if I write to a .txt file or some other type of file, the control characters will remain?
How does this work exactly? At some point the control characters are getting stripped or ignored and I am not sure how or when.
See this code.
It checks whether the output is going to a terminal (by checking process.stdout.isTTY) or to somewhere else, like a file. If the latter, no color codes are emitted.
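The colors module's source isn't reproduced here, but the underlying idea is the classic "is standard output a terminal?" test. A minimal sketch of the same check in plain C (isatty() is the C-level counterpart of process.stdout.isTTY):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Emit ANSI color codes only when stdout is a terminal; when output is
         * redirected to a log file or piped, fall back to plain text. */
        if (isatty(STDOUT_FILENO))
            printf("\033[31mred when printed to a terminal\033[0m\n");
        else
            printf("plain text when redirected to a file\n");
        return 0;
    }

So it isn't that the control characters are stripped from the file afterwards; a library doing this check never writes them in the first place when the destination is not a TTY.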

Why is a file accessible after deleting it in Unix?

I thought about a concurrency issue (in Solaris): what happens if someone tries to delete a file while it is being read? I have a query regarding file existence in Solaris/Linux. Suppose I have a file test.txt; I have opened it in the vi editor, and then I open a duplicate session and remove that file, but even after deleting it I am still able to read it. So here are my questions:
Do I need to think about any locking mechanism while reading, so that no one is able to delete the file while it is being read?
What is the reason for the different behavior from Windows (in Windows, if a file is open in some editor, we cannot delete it)?
After removing the file, how am I still able to read it if I haven't closed it in the vi editor?
I am asking about files in general, but yes, platform-specific, i.e. Unix. What will happen if I am using a Java program (BufferedReader) to read a file and the file is deleted while it is being read? Will the BufferedReader still be able to read the next chunk of the file or not?
You have basically 2 or 3 unrelated questions there. Text editors like to read the whole file into memory at the start of the editing session. Imagine every character you type being saved to disk immediately, with all characters after it in the file being rewritten one place further along to make room. That would be awful. Much better that the thing you're actually editing is a memory representation of the file (array of pointers to lines, probably with some metadata attached) which only gets converted back into a linear stream when you explicitly save.
Any relatively recent version of vim will notify you if the file you are editing is deleted from its original location with the message
E211: File "filename" no longer available
This warning is not just for unix. gvim on Windows will give it to you if you delete the file being edited. It serves as a reminder that you need to save the version you're working on before you exit, if you don't want the file to be gone.
(Note: the warning doesn't appear instantly - vim only checks for the original file's existence when you bring it back into the foreground after having switched away from it.)
So that's question 1, the behavior of text editors - there's no reason for them to keep the file open for the whole session because they aren't actually using it except at startup and during a save operation.
Question 2, why do some Windows editors keep the file open and locked - I don't know, Windows people are nuts.
Question 3, the one that's actually about unix, why do open files stay accessible after they're deleted - this is the most interesting one. The answer, guaranteed to shock you when presented directly:
There is no command, function, syscall, or any other method which actually requests deletion of a file.
Underlying rm and any other command that may appear to delete a file, there is the system call unlink. And it's called unlink, not remove or deletefile or anything similar, because it doesn't remove a file. It removes a link (a.k.a. directory entry), which is an association between a file and a name in a directory. (Note: ANSI C added remove as a more generic function to appease non-unix people who had no intention of implementing unix filesystem semantics, but on unix, remove is just a rmdir if the target is a directory, and unlink for everything else.)
A file can have multiple links (see the ln command for how they are created), which means that the same file is known by multiple names. If you rm one of them, the others stick around and the file is not deleted. What happens when you remove the last link? Well, now you have a file with no name. But names are only one kind of reference to a file. There are at least 2 others: file descriptors and mmap regions. When the last reference to a file goes away, that's when the file is deleted.
Since references come in several forms, there are many kinds of events that can cause a file to be deleted. Here are some examples:
unlink (rm, etc.)
close file descriptor
dup2 (can implicitly close a file descriptor before replacing it with a copy of a different file descriptor)
exec (can cause file descriptors to be closed via close-on-exec flag)
munmap (unmap memory region)
mmap (if you create a new memory map at an address that's already mapped, the old mapping is unmapped)
process death (which closes all file descriptors and unmaps all memory mappings of the process)
normal exit
fatal signal generated by the kernel (^C, segfault)
fatal signal sent from another process (kill)
I won't call that a complete list. And I don't encourage anyone to try to build a complete list. Just know that rm is "remove name", not "remove file", and files go away as soon as they're not in use.
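A small C program makes the point concrete (a throwaway demo, not part of the original answer): after unlink() removes the only name, an already-open file descriptor still reads the contents, and the data is only released once that last reference goes away.

    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        const char msg[] = "still here after unlink\n";
        char buf[64];

        int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (write(fd, msg, sizeof(msg) - 1) < 0) { perror("write"); return 1; }

        /* Remove the only name; the file now exists solely through this fd. */
        if (unlink("demo.txt") < 0) { perror("unlink"); return 1; }

        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("read after unlink: %s", buf);   /* still prints the content */
        }

        close(fd);   /* last reference gone; now the data is actually freed */
        return 0;
    }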
If you want to destroy the contents of a file immediately, truncate it. All processes already using it will find that its size has suddenly become 0. (This is destruction as far as the normal file access methods are concerned. To destroy it more thoroughly so that even someone with raw disk access can't read what used to be there, you need to overwrite it. There's a tool called shred for that.)
I think your question has nothing to do with the difference between Windows and Linux. It's about how vi works.
When using vi to edit a file, vi will create a .swp file, and the .swp file is what you are actually editing. Because of that, another user deleting the original file will not affect your editing.
And when you type :w in vi, vi will use the .swp file to overwrite the original file.

I want to change the way text is represented internally in ANY Text Editor

I want to use an algorithm to reduce the memory used to save a particular text file. I don't really know how text is stored, but I have an idea in mind.
Would it be better to extend an open-source text editor (if yes, then which one) or to write a text editor myself?
It would be nice if someone could also give me a link or tutorial covering the basics of how text editors work and the way the data is stored.
Edited to add
To clarify, what I wanted to do is, instead of saving duplicates of a word, build a hash table and store the addresses where the word needs to be placed.
That way I wouldn't be storing the duplicates.
This would have become specific to a particular text editor.
Update
Thanks everyone, I got what all of you are trying to say. Anyway, all I wanted to do was, instead of saving duplicates of a word, make a hash table and store the addresses where the word needs to be placed.
That way I wouldn't be storing the duplicates.
Yes, and this would have become specific to a particular text editor; I never realized that.
I want to use an algorithm to reduce the memory used to save a particular text file
If you did this you would no longer have a text editor, but instead you would have created some sort of binary file editor.
The whole point of the text file format is that it is universal, meaning any text file can be opened in any other text editor.
Emacs handles compression transparently. Just create a text file with a .gz extension. Emacs will automatically compress the contents of the file during the save operation, and decompress it when you open the file next time.
Text is basically stored as-is: every character takes up a byte or two (for wide chars), and no conversion is done on it when it's saved. It might add an end-of-file character or something, though.
Don't try coming up with your own algorithm to compress these files. That's what zip files and other archives were created for; they're really good at compressing text. If you wanted to add this feature to your text editor, you'd have to add some sort of post-save hook to zip the file, and a hook on the open command to unzip it, unless you wanted to do it by hand every time.
Don't try writing the text editor yourself from scratch, unless (maybe) you're writing Notepad. Text editors with syntax highlighting aren't very easy to make, even with the proper libraries. I'd say write a plugin for something like Visual Studio or what have you, or find an open-source text editor.
