Inserting text into a text file - linux

Alright, this may be the simplest (or the stupidest) question, but I just got to know...
Suppose I have a text file containing account no. and balance. I am writing a program to search the file for an entered account no. and update the balance field in the same text file with a new balance.
I am finding it surprisingly difficult to do so using file streams. The problem is that I am trying to overwrite the balance string in the said text file with a new balance string.
So, if the balance is 1000 (4 digits), I can overwrite it with another 4 digit string. But, if the new balance string is more than 4 digits, it is overwriting the data after the balance field (it is a simple text file mind you...).
For example, if the text file contains
Acc. No. balance
123456 100
123567 2500
The fields are separated by TAB '\t' character, and next record is separated by a newline '\n'. If I enter new deposit of 200000 for account 123456, the fwrite() function overwrites the data in the text file as...
Acc. No Balance
123456 2001003567 2500
You can notice that the '\n' after the balance field, and two digits of the next account's acc. no., are overwritten.
Of course, no-one wants that to happen :)
What I need is a way to insert text in that file, not just overwrite it. There are many results for doing this using Java, Python or even sed, but nothing using FILE streams.
Please share your thoughts... thanks.

You'll have to move all data after the insertion point a few bytes up first. That's what Java, sed or Python do as well, if they aren't writing a temporary file to begin with.
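For illustration, a minimal C sketch of that shifting approach (error handling trimmed; assumes the tail of the file fits in memory and the file was opened with fopen(path, "r+b")):

/* Insert text_len bytes at offset pos by buffering everything that
   currently follows pos, then writing text + old tail back. */
#include <stdio.h>
#include <stdlib.h>

int insert_bytes(FILE *fp, long pos, const char *text, size_t text_len)
{
    fseek(fp, 0, SEEK_END);
    long tail_len = ftell(fp) - pos;

    char *tail = malloc(tail_len);          /* the data to shift up */
    if (tail == NULL) return -1;

    fseek(fp, pos, SEEK_SET);
    fread(tail, 1, tail_len, fp);

    fseek(fp, pos, SEEK_SET);
    fwrite(text, 1, text_len, fp);          /* new text... */
    fwrite(tail, 1, tail_len, fp);          /* ...then the shifted tail */

    free(tail);
    return 0;
}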

If you really want to manage your data in a plain text file:
While you are reading the file in, write a modified version of your data into a temporary file, then delete the original file and rename the temp file to the original filename. But be careful that no other process accesses the same file concurrently.
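A hedged C sketch of that recipe, assuming the TAB-separated two-column layout from the question (the temp file name and function name are made up):

#include <stdio.h>

int update_balance(const char *path, long acc_no, long new_balance)
{
    FILE *in  = fopen(path, "r");
    FILE *out = fopen("accounts.tmp", "w");
    if (in == NULL || out == NULL) return -1;

    /* Copy the "Acc. No. balance" header line through unchanged. */
    char header[256];
    if (fgets(header, sizeof header, in))
        fputs(header, out);

    /* Rewrite every record; the matching one gets the new balance. */
    long acc, bal;
    while (fscanf(in, "%ld\t%ld", &acc, &bal) == 2) {
        if (acc == acc_no)
            bal = new_balance;
        fprintf(out, "%ld\t%ld\n", acc, bal);
    }

    fclose(in);
    fclose(out);

    /* Swap the temp file into place. */
    remove(path);
    return rename("accounts.tmp", path);
}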
Database systems were invented for exactly this purpose, so I recommend managing your data in a database table and dynamically creating a text report when needed.
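To make that concrete, a small sketch using SQLite from C (table and file names are invented; build with -lsqlite3). The whole "insert into the middle of a file" problem disappears, because updating one row is a single statement:

#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
    sqlite3 *db;
    char *err = NULL;
    if (sqlite3_open("accounts.db", &db) != SQLITE_OK) return 1;

    /* Create the table once, then deposit 200000 into account 123456. */
    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS accounts(acc_no INTEGER PRIMARY KEY,"
        "                                    balance INTEGER);"
        "UPDATE accounts SET balance = balance + 200000 WHERE acc_no = 123456;",
        NULL, NULL, &err);
    if (err != NULL) { fprintf(stderr, "%s\n", err); sqlite3_free(err); }

    sqlite3_close(db);
    return 0;
}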

Related

How to bring text qualifier back to an exported flat file

We designed an SSIS package for our client to export data from their databases to flat files that need to be further processed by us.
In short, one of the flat file destinations in the DFT does not have a text qualifier specified. These days we have received files with lots of data that contains the same symbol as the column delimiter. Its appearance is unpredictable, meaning it could show up in any of the columns.
Before sending the updated package to the client, is there any other way (no hard-coded fix on the backend, such as updating each column with the column to its right) to tell where each original column ends?
The easiest way is to ask them to export the files again with a text qualifier to escape the symbol, but for business reasons that might not be our top pick. Has anyone experienced this before? Any advice or suggestions?

Easy csv to excel

My customer has an issue with certain .csv files: Excel auto-detects data types and alters the data when the files are opened. The current workaround is to open an instance of Excel, open the file, and go through the many-step process of choosing data types.
There is no standard format for which data elements will be in each csv file, so I've been thinking up methods to write code that is fairly flexible. To keep this short: I think I've got a good idea of how to make something flexible enough to support the customer's needs. It involves running an append query in Access to dynamically alter/create specifications, but I cannot figure out how to obtain values for the "Start" and "Width" fields in the MSysIMEXColumns table.
Is there a function in vba that can help me read a csv file, and gather the column names along with the "Start" and "Width" values? Bonus if you can also help me plug those values into an Access table. Thanks for your help!!
First of all... there is NO "easy csv to Excel" conversion when your customer has:
"...no standard format for which data elements will be in each csv file."
I used to work for a data processor where we combined thousands of different customer files, trying to plunge them into a structured database. The ways customers can mangle data are endless. And just when you think you've figured them out, they find new ways of mangling data.
I once had a customer who had the brilliant idea of storing their "Dead Beat" flag IN their full name field. And then they didn't tell us they did so. And when we mailed the list out to their customers, they tried to blame us for not catching it. Can you imagine someone waking up one morning and getting junk mail addressed to "Dear, Dead Beat"?
But that's only one way "no standard format" customers can make it impossible to catch their errors. They can be notorious for mixing in text with number fields. They can be notorious for including invisible escape characters in text fields that make printers crash. And don't even get started on how many different ways abbreviations can cause data to be inconsistent.
Anyway... to answer your question:
If you are using CSV files, they are comma delimited. You don't need "Start" and "Width".
"Start" and "Width" are for Fixed Width files. And if your customer is giving you a fixed width file, they NEED to give you a "standard format". If they don't then you are just trying to mind read what they did. And while you can probably guess correctly most of the time, inevitably, you are going to guess wrong and your customer is going to try to blame you for the error.
Other than that, sometimes you just have to go through the long slog of having a human visually inspect things to make sure the convert went as planned. I'd also suggest lots of counts and groupings on your data afterwards to make sure they didn't do something unexpected.
Trying to convert undocumented files is a very difficult and time consuming task. It's why they are paying you big bucks to do it.
So to answer your question again, "Start" and "Width" are for Fixed Width files. If they are sending you Fixed Width files, they need to send specifications.
If it's a csv file, you don't need "Start" and "Width". The delimiter (usually a comma) is what separates your fields.
** EDIT **
Ok... thinking through this some more... I'll take a guess at what you are doing:
1) You create and save a generic spec in Access for delimited files.
2) You open your CSV file through vba and read the new delimited header record with all the column header names.
3) You try to modify the MSysIMEXColumns table to include new fields and modify old ones.
4) You now run your import based on the new spec you created.
If that is the case, you need to do a couple of things:
First, understand that this is a dangerous thing to do. Access uses wizards to create its system tables. If you muck with these, you don't know how it might affect the wizards when they try to access those tables again. You are best off creating a new spec for each new file type, using the Access wizards.
Second, once you come to the conclusion that you are smarter than Microsoft (which is probably a good bet anyway), you can try to make a dynamic spec file.
Third, you NEED to make sure your spec record in MSysIMEXSpecs is correct. That means you need to have it set as a delimited file and have the correct delimiter in there. Plus you need to have the FileType correct. You need to know if it's Unicode or any number of other file types that your customer could be sending you.
And by "correct delimiter" I mean... try to get your customer to send you "pipe delimited" | files. If they send you "comma delimited" files, you run the risk of them sending you text fields with comments or addresses that include a comma in the data. Say they cut and paste a street address that has a comma in it... that has the fantastic effect of splitting that address into two fields and pushing ALL of your subsequent columns off by one. It's lots of work to figure out if your data is corrupted this way. Pipes are much less likely to be populated in your data by accident.
Fourth, assuming your MSysIMEXSpecs is correct, you can then modify your MSysIMEXColumns table. You will need to extract your column headers from your csv file. You'll need to know how many fields there are and their order. You can then modify your current records to have the new field names and add any new records for new fields, or delete records if there are less fields than before.
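The header-extraction step itself is just string splitting; here is a language-agnostic sketch in C (pipe delimiter per the advice above; the file name is an example). The printed sequence number is what goes into the Start column, with Width fixed at 1:

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *fp = fopen("import.csv", "r");
    char line[4096];
    if (fp == NULL || fgets(line, sizeof line, fp) == NULL) return 1;
    line[strcspn(line, "\r\n")] = '\0';     /* strip the line ending */

    /* Walk the header fields in order; strtok is fine for a header row. */
    int start = 1;
    for (char *f = strtok(line, "|"); f != NULL; f = strtok(NULL, "|"))
        printf("Start=%d  Width=1  FieldName=%s\n", start++, f);

    fclose(fp);
    return 0;
}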
I'd suggest saving them all as text fields DataType=10 in a staging table. That way you can go back and do analysis on each field to see if they mixed text into numeric fields or any other kind of craziness that customers love to do.
And since you know your spec in MSysIMEXSpecs is a delimited field file, you can give each record a "Start" field equal to the sequence your Header record calls for.
Attributes  DataType  FieldName  IndexType  SkipColumn  SpecID  Start  Width
0           10        Rule       0          0           3       1      1
0           10        From Bin   0          0           3       2      1
0           10        toBin      0          0           3       3      1
0           10        zone       0          0           3       4      1
0           10        binType    0          0           3       5      1
0           10        Warehouse  0          0           3       6      1
0           10        comment    0          0           3       7      1
Thus, the first field will have a "Start" of 1. The second field will have a "Start" of 2. etc.
Then your "Width" fields will all have a length of 1. Since your file is a delimited file, Access will figure out the correct width when it does the import.
And as long as your SpecID is pointing to the correct delimited spec, you should be able to import any csv file.
Fifth, after the data is in your staging table, you should then do an analysis of each field to make sure you know what type of data you really have and that your suspected data type doesn't violate any data type rules. At that point you can transfer them to a second staging table where you convert to the correct data types. You can do this all through VBA. You'll probably need to create quite a few home grown functions to validate your data. (This is the NOT so easy part about "easy csv to Excel" coding.)
Sixth, after you are done all of your data massaging, you can now transfer the good data to your live database or Excel spreadsheet. Inevitably you'll always have some records that don't fit the rules and you'll have to have someone eyeball the data and figure out what to do with it.
You are at the very tip of the iceberg in data conversion management. You will soon learn about all kinds of amazingly horrible stuff. And it can take years to write automated procedures that catch all the craziness that data processors / customers send to you. There are a gazillion data mediums that they can send the data to you in. There are a gazillion different data types the data can be represented in. Along with a gazillion different formats that the data resides in. And that's all before you even get to think about data integrity.
You'll probably become very acquainted with GIGO (Garbage In, Garbage Out). If your customer tries to slip a new format past you (and they will) without telling you the "standard format for which data elements will be", you will be left trying to guess what the data is. And if it's garbage... best of luck trying to create an automated system for that.
Anyway, I hope the MSysIMEXColumns info helps. And if they ever give you Fixed Length files, just know you'll have to write a whole new system to get those into your database.
Best of luck :)

How do I create a fixed width text file?

I have a fixed width text file that I needed to edit about 200 rows of. Importing into Excel is easy, but when I have completed my edits and try to save the file as a space-delimited or text file, all the spacing goes out of whack; i.e. the first field in Excel is padded out to 6 characters, but when I save the file as space-delimited or text it turns that field into 8 characters.
Please note that I'm using a LEFT(text&REPT(" ",30),30) formula to get the required padding, which works very nicely. However, I can't seem to save the file with the correct number of spaces. I have also just tried copying and pasting into a Notepad file, but this seems to just create more unwanted spaces etc.
How do I create a fixed width file when I have all the data I need and the field length requirements?? Has anyone had this trouble before? Thanks in advance.
I agree with Gary's Student. Just go to: Save As -> Formatted Text (Space Delimited) (*.prn). This gives you almost the same functionality as you have in Excel.
For more information you may refer to:
https://superuser.com/questions/100433/export-an-excel-spreadsheet-to-fixed-width-text-file
I found that the best way to do this was to use Access: save as a text file, and then you can set your own field widths and export. Excellent!
I'll suggest an export to csv (or similar) and then converting it with UltraEdit's super simple "Convert CSV to fixed width" function.
It scans the file and suggests a column width based on your content.
You can easily define your own preferred column width in a 30,25,25,40 pattern.
There's a 30-day trial; if you like it, it's well worth the $99 license...
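For completeness, if a scripted route is acceptable, fixed-width output needs just one printf width specifier per field. A minimal C sketch, borrowing the 30,25,25,40 pattern above (the rows are made-up sample data):

#include <stdio.h>

int main(void)
{
    /* "%-30.30s" left-justifies in a 30-character column, pads with
       spaces, and truncates anything longer than the column. */
    const char *rows[][4] = {
        { "123456", "SMITH", "LONDON", "ACTIVE" },
        { "123567", "JONES", "LEEDS",  "CLOSED" },
    };
    FILE *out = fopen("fixed.txt", "w");    /* file name is an example */
    if (out == NULL) return 1;

    for (int i = 0; i < 2; i++)
        fprintf(out, "%-30.30s%-25.25s%-25.25s%-40.40s\n",
                rows[i][0], rows[i][1], rows[i][2], rows[i][3]);

    fclose(out);
    return 0;
}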

save datawindow as text in powerbuilder with some additional text

***Process Date From:
01/05/2012 0:00
Group;Member
Status:****
Rcp Cd   Health Num  Rcp Name          Rcp Dob
1042231  1           MARIA TOVAR DIAS  14-Feb-05
1042256  2           KHALID KHAN       04-Mar-70
1042257  3           SAMREEN ISMAT     25-Mar-80
1042257  5           SAMREEN ISMAT     25-Mar-80
1042257  4           SAMREEN ISMAT     25-Mar-80
I want my PowerBuilder datawindow Save As text to look like this. The bold text (between the asterisks) is the additional text I want to add; the rest is the current Save As text result.
Text files cannot contain formatting. There's no way to get bold text in a plain text file. I suggest adding the text to your datawindow header band (bolded, with an expression to make sure it only displays on the first page), then saving the results as HTML.
Well, you didn't mention which version of PB you are using, so I'll assume a recent one in which case you have some better options such as SaveAsAscii and/or SaveAsFormattedText which offer more flexibility in displaying column headers, computed fields, etc.
If you want to add the top section, I would add one or more additional dummy columns (or computed fields) to your dataobject for the additional data. Then either populate the dummy columns manually after retrieve, or via expression in computed field. You could put all of it in one computed field that wraps, or use four different ones (e.g. process_date_label, process_datetime, group_status, status).
The two newer versions of SaveAs will work better for you as they display column header values instead of the column header name. SaveAsAscii came out pretty early somewhere around version 7 of PowerBuilder. SaveAsFormattedText is relatively new and came out somewhere around PB version 11 and it is a lot like SaveAsAscii but it lets you choose file encoding.
If you need more explicit detail let me know but I am sure you can get something to work using SaveAsAscii and extra columns.
Pseudo code, fleshed out as a rough PowerScript sketch (file names and the header text are examples):

// Do the SaveAs to a temp file
dw_1.SaveAs("temp.txt", Text!, TRUE)

int li_temp, li_out
string ls_line

// Open the temp file for read and the output file for write (replace), both in line mode
li_temp = FileOpen("temp.txt", LineMode!, Read!)
li_out  = FileOpen("final.txt", LineMode!, Write!, LockWrite!, Replace!)

// Write your additional text lines first (~r~n lets you write multiple lines at once)
FileWrite(li_out, "Process Date From:~r~n01/05/2012 0:00~r~nGroup;Member~r~nStatus:")

// Copy the saved text line by line; FileRead returns -100 at EOF (0 is an empty line, not EOF)
DO WHILE FileRead(li_temp, ls_line) <> -100
    FileWrite(li_out, ls_line)
LOOP

FileClose(li_temp)
FileClose(li_out)
FileDelete("temp.txt")

CSV Exporting: Preserving leading zeros

I'm working on a .NET application which exports CSV files to open in Excel and I'm having a problem with preserving leading zeros when the file is opened in Excel. I've used the method mentioned at http://creativyst.com/Doc/Articles/CSV/CSV01.htm#CSVAndExcel
This works great until the user decides to save the CSV file within Excel. If the file is opened again in Excel then the leading zeros are lost.
Is there anything I can do when generating the CSV file to prevent this from happening?
This is not a CSV issue.
This is Excel loving to play with CSV files.
Change the extension to something else.
As @GSerg mentions, this is not a CSV issue.
If your users must edit/save in Excel, they need to open the csv file, select the entire worksheet, right-click, choose "Format Cells" and from the Category list select "Text". This will preserve the leading zeros, since the numbers will be treated as simple text.
Alternatively, you could use the Open XML SDK 2.0, or some other Excel library, to create an xlsx file from your csv data and programmatically set the cell type to Text in order to take the end users out of the equation...
I found a nice way around this: if you add a space anywhere in the phone number, the cell is not treated as a number and is treated as a text cell in both Excel and Apple's iWork Numbers.
It's the only solution I've found so far that plays nice with Numbers.
Yes, I realise the number then has a space, but this is easy to process out of large chunks of data; you just have to select a column and remove all spaces.
Also, if this is web related, most web-type things are OK with users entering a space in the number field. E.g. you can tap-to-call on mobiles.
The challenge is to get the space in there in the first place.
In use:
01202123456 = 1202123456
but
01202 123456 = 01202 123456
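Getting the space in at generation time is a one-line string edit. A hedged C sketch, assuming UK-style numbers with a five-digit area code as in the example above:

#include <stdio.h>
#include <string.h>

/* Insert a space after the (assumed) five-digit area code so spreadsheet
   apps treat the value as text and keep the leading zero. */
void space_phone(const char *in, char *out, size_t outlen)
{
    size_t n = strlen(in);
    if (n > 5 && n + 2 <= outlen) {
        memcpy(out, in, 5);
        out[5] = ' ';
        strcpy(out + 6, in + 5);
    } else {
        snprintf(out, outlen, "%s", in);    /* too short: copy unchanged */
    }
}

int main(void)
{
    char buf[32];
    space_phone("01202123456", buf, sizeof buf);
    printf("%s\n", buf);                    /* prints "01202 123456" */
    return 0;
}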
Ok, new discovery.
Using Quick Preview on Mac to view a CSV file, the telephone column displays perfectly, but opening the file fully with Numbers or Excel ruins that column.
On some level Mac OS X is capable of handling that column correctly with no user meddling.
I am now working on the best/easiest way to make a website output a universally accepted CSV with telephone numbers preserved.
But maybe with that info someone else has an idea on how to make Numbers handle the file in the same way that Quick Preview does?