struct smt{
char *c;
};
int main(){
char *w="astring";
if(smt->c == w[0])
...do something
}
How do I fix the warning that I get in the if, and what exactly causes it?
The warning shows up because you're comparing smt->c, which is a char*, to w[0], which is a single character (which for this comparison is implicitly promoted to int).
You probably meant comparing the first character like this:
if(smt->c[0] == w[0]) { ... }
If you want to compare full strings, use
if(strcmp(smt->c, w) == 0) { ... }
or even better, use strncmp if you know the maximum length the strings can have.
The error comes from the fact that you almost never want to compare an address (pointer) with a character.
You're comparing a char* c with a char 'a'. What you want to do is this I believe:
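#include <string.h>

struct smt{
    char *c;
};
int main(){
    struct smt obj = { "astring" };
    struct smt *smt = &obj;   // the question never declares smt; assume a pointer to an instance
    const char *w="astring";
    // smt->c is a char* (the whole string)
    // w[0] is the first character, so 'a'
    if(strcmp(smt->c, w) == 0) {
        // ...do something
    }
    return 0;
}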
If you want to compare the first characters of both strings, you have to add [0] to smt->c
I'm trying to run this C++ code in Xcode 8.1:
std::string str = "g[+g]g[-g]g[−g[+g]g]g[+g]g";
for (auto& c : str) {
printf("%c", c);
}
and I'm getting this as output:
g[+g]g[-g]g[\342\210\222g[+g]g]g[+g]g
Does anyone know why some characters are coming out as hexadecimal characters?
I already tried to print as c_str().
I want to know if there is any way to convert a Unicode code point to a string or char in C++11.
I've been trying with the extended Latin letter Á (as an example), which has this encoding:
letter: Á
Unicode: 0x00C1
UTF8 literal: \xc3\x81
I've been able to do so if it's hardcoded as:
const char* c = u8"\u00C1";
But if I get the value as a short, how can I do the equivalent to get the char* or std::string 'Á'?
EDIT, SOLUTION:
I was finally able to do so, here is the solution if anyone needs it:
std::wstring ws;
for(short input : inputList)
{
wchar_t wc(input);
ws += wc;
}
std::wstring_convert<std::codecvt_utf8<wchar_t>> cv;
str = cv.to_bytes(ws);
Thanks for the comments they were very helpful.
The C++11 standard contains codecvt_utf8, which converts between some internal character type (try char16_t if your compiler has it, otherwise wchar_t) and UTF-8 encoding.
The problem is that char is only one byte long, while many Unicode characters need more than one byte (Á takes two bytes in both UTF-8 and UTF-16).
You can still treat the data as char*, but you must remember that you are not dealing with an ASCII string (with a 16-bit encoding there will be zero bytes).
You may have to switch to wchar_t.
I need to generate a fixed-width file with a few of the columns in packed decimal format and a few of the columns in normal number format. I was able to generate it. I zipped the file and passed it on to the mainframe team. They imported it and unzipped the file and converted to EBCDIC. They were able to get the packed decimal columns without any problem but the normal number fields seemed to have messed up and are unreadable. Is there something specific that I need to do while processing or zipping my file before sending it to the mainframe? I am using COMP-3 packed decimal. I am currently working on Windows XP, but the real production will be on RHEL.
Thanks in advance for helping me out. This is urgent.
Edited on 06 June 2011:
This is how it looks when I switch on HEX.
. . . . . . . . . . A . .
333333333326004444
210003166750C0000
The 'A' in the first row has a slight accent so it is not the actual upper case A.
210003166 is the raw decimal. The value of the packed decimal before comp3 conversion is 000000002765000 (we can ignore the leading zeroes if required).
UPDATE 2 : 7th June 2011
This is how I am creating the file that gets loaded into the mainframe:
The file contains two columns: identification number and amount. The identification number doesn't require COMP-3 conversion and the amount does. The COMP-3 conversion is performed at the Oracle SQL end. Here is the query for performing the conversion:
Select nvl(IDENTIFIER,' ') as IDENTIFIER, nvl(utl_raw.cast_to_varchar2(comp3.convert(to_number(AMOUNT))),'0') as AMOUNT from TABLEX where IDENTIFIER = 123456789
After executing the query, I do the following in Java:
String query = "Select nvl(IDENTIFIER,' ') as IDENTIFIER, nvl(utl_raw.cast_to_varchar2(comp3.convert(to_number(AMOUNT))),'0') as AMOUNT from TABLEX where IDENTIFIER = 210003166"; // this is the select query with COMP3 conversion
ResultSet rs = getConnection().createStatement().executeQuery(sb.toString());
sb.delete(0, sb.length()-1);
StringBuffer appendedValue = new StringBuffer (200000);
while(rs.next()){
appendedValue.append(rs.getString("IDENTIFIER"))
.append(rs.getString("AMOUNT"));
}
File toWriteFile = new File("C:/transformedFile.txt");
FileWriter writer = new FileWriter(toWriteFile, true);
writer.write(appendedValue.toString());
//writer.write(System.getProperty(ComponentConstants.LINE_SEPERATOR));
writer.flush();
appendedValue.delete(0, appendedValue.length() -1);
The text file thus generated is manually zipped by a winzip tool and provided to the mainframe team. Mainframe team loads the file into mainframe and browses the file with HEXON.
Now, coming to the conversion of the upper four bits of the zoned decimal, should I be doing it before writing it to the file? Or should I apply the flipping at the mainframe end? For now, I have done the flipping at the Java end with the following code:
public static String toZoned(String num) {
if (num == null) {
return "";
}
String ret = num.trim();
if (num.equals("") || num.equals("-") || num.equals("+")) {
// throw ...
return "";
}
char lastChar = ret.substring(ret.length() - 1).charAt(0);
//System.out.print(ret + " Char - " + lastChar);
if (lastChar < '0' || lastChar > '9') {
} else if (num.startsWith("-")) {
if (lastChar == '0') {
lastChar = '}';
} else {
lastChar = (char) (lastChar + negativeDiff);
}
ret = ret.substring(1, ret.length() - 1) + lastChar;
} else {
if (num.startsWith("+")) {
ret = ret.substring(1);
}
if (lastChar == '0') {
lastChar = '{';
} else {
lastChar = (char) (lastChar + positiveDiff);
}
ret = ret.substring(0, ret.length() - 1) + lastChar;
}
//System.out.print(" - " + lastChar);
//System.out.println(" -> " + ret);
return ret;
}
The identifier becomes 21000316F at the Java end and that is what gets written to the file. I have passed the file on to the mainframe team and am awaiting the output with HEXON. Do let me know if I am missing something. Thanks.
UPDATE 3: 9th Jun 2011
Ok I have got mainframe results. I am doing this now.
public static void main(String[] args) throws FileNotFoundException {
// TODO Auto-generated method stub
String myString = new String("210003166");
byte[] num1 = new byte[16];
try {
PackDec.stringToPack("000000002765000",num1,0,15);
System.out.println("array size: " + num1.length);
} catch (DecimalOverflowException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (DataException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
byte[] ebc = null;
try {
ebc = myString.getBytes("Cp037");
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
PrintWriter pw = new PrintWriter("C:/transformationTextV1.txt");
pw.printf("%x%x%x%x%x%x%x%x%x",ebc[0],ebc[1],ebc[2],ebc[3],ebc[4], ebc[5], ebc[6], ebc[7], ebc[8]);
pw.printf("%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x",num1[0],num1[1],num1[2],num1[3],num1[4], num1[5], num1[6], num1[7],num1[8], num1[9],num1[10], num1[11],num1[12], num1[13], num1[14],num1[15]);
pw.close();
}
And I get the following output:
Á.Á.Á.Á.Á.Á.Á.Á.Á.................Ä
63636363636363636333333333333333336444444444444444444444444444444444444444444444
62616060606361666600000000000276503000000000000000000000000000000000000000000000
I must be doing something very wrong!
UPDATE 4: 14th Jun 2011
This query was resolved after using James' suggestion. I am currently using the below code and it gives me the expected output:
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
String myString = new String("210003166");
byte[] num1 = new byte[16];
try {
PackDec.stringToPack("02765000",num1,0,8);
} catch (DecimalOverflowException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (DataException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
byte[] ebc = null;
try {
ebc = myString.getBytes("Cp037");
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
FileOutputStream writer = new FileOutputStream("C:/transformedFileV3.txt");
writer.write(ebc,0,9);
writer.write(num1,0,8);
writer.close();
}
As you are coding in Java and you require a mix of EBCDIC and COMP-3 in your output, you will need to do the Unicode to EBCDIC conversion in your own program.
You cannot leave this up to the file transfer utility, as it will corrupt your COMP-3 fields.
But luckily you are using Java, so it's easy using the getBytes method of the String class.
Working Example:
package com.tight.tran;
import java.io.*;
import name.benjaminjwhite.zdecimal.DataException;
import name.benjaminjwhite.zdecimal.DecimalOverflowException;
import name.benjaminjwhite.zdecimal.PackDec;
public class worong {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
String myString = new String("210003166");
byte[] num1 = new byte[16];
try {
PackDec.stringToPack("000000002765000",num1,0,15);
System.out.println("array size: " + num1.length);
} catch (DecimalOverflowException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch (DataException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
byte[] ebc = null;
try {
ebc = myString.getBytes("Cp037");
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
FileOutputStream writer = new FileOutputStream("C:/transformedFile.txt");
writer.write(ebc,0,9);
writer.write(num1,0,15);
writer.close();
}
}
Produces (for me!):
0000000: f2f1 f0f0 f0f3 f1f6 f600 0000 0000 0000 ................
0000010: 0000 0000 2765 000c 0d0a ....'e....
"They were able to get the packed decimal columns without any problem but the normal number fields seemed to have messed up " would seem to indicate that they did not translate ASCII to EBCDIC.
ASCII zero x'30' should translate to EBCDIC zero x'F0'. If this was not done, then x'30' does not map to a valid character on most EBCDIC displays (depending on the EBCDIC code page).
However, even if they did translate, you will have a different problem, as all or some of your COMP-3 data will be corrupted. Simple translation programs have no way to distinguish between character and COMP-3 data, so they will convert a number such as x'00303C' to x'00F04C', which will cause any mainframe program to bomb out with the dreaded "0C7 Decimal Arithmetic Exception" (culturally equivalent to "StackOverflow").
So basically you are in a lose/lose situation. I would suggest you ditch the packed decimals and use plain ASCII characters for your numbers.
The zipping should not cause you a problem, except that the file transfer utility was probably doing ASCII to EBCDIC translation on the plain text file but not on the zipped file.
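To make the corruption concrete, here is a minimal Java sketch of what a layout-unaware ASCII to EBCDIC translation does to both kinds of data. The class and method names are made up for illustration, and it assumes the translation behaves like Java's Cp037 charset; a real transfer utility may use a different table.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BlindTranslateDemo {
    // Treats every byte as an ASCII/Latin-1 character and re-encodes it as EBCDIC.
    // This is an assumed model of a translation that knows nothing about the record layout.
    static byte[] asciiToEbcdic(byte[] raw) {
        String asText = new String(raw, StandardCharsets.ISO_8859_1);
        return asText.getBytes(Charset.forName("Cp037"));
    }

    // Formats a byte array as upper-case hex, two characters per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] text = "30".getBytes(StandardCharsets.US_ASCII); // character digits
        byte[] packed = {0x00, 0x30, 0x3C};                      // COMP-3 bytes for +00303
        // Character digits come out as valid EBCDIC digits...
        System.out.println(hex(text) + " -> " + hex(asciiToEbcdic(text)));
        // ...but the packed bytes are rewritten too and no longer hold the same number.
        System.out.println(hex(packed) + " -> " + hex(asciiToEbcdic(packed)));
    }
}
```

The character field survives the translation, while the packed field is changed byte by byte as if it were text, which is why the two kinds of columns cannot go through the same blanket conversion.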
"... converted to EBCDIC..." may be part of the problem.
Unless the mainframe conversion process is "aware" of the record layout it is
working with (ie. which columns contain binary, packed and/or character data),
it is going to mess something up because the mapping process is format dependent.
You have indicated the COMP-3 data are ok, I am willing to bet that either
the "converted to EBCDIC" doesn't do anything, or it is performing some sort of
ASCII to COMP-3 conversion on all of your data - thus messing up non COMP-3 data.
Once you get to the mainframe, this is what you should see:
COMP-3 - each byte contains 2 decimal digits except the last (rightmost, least
significant) byte, which contains only 1 decimal digit in the upper 4 bits and the
sign field in the lower 4 bits. Each decimal digit is stored as a 4-bit binary
value (eg. 5 = B'0101')
Zoned Decimal (normal numbers) - each byte contains 1 decimal digit. The upper
four bits should always contain HEX F, except possibly in the least significant
byte, where the upper 4 bits may contain the sign and the lower 4 bits a digit. The
4-bit digit is stored as a binary value (eg. 5 = B'0101')
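The COMP-3 layout described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name and the pack method are hypothetical (not part of any library mentioned here), positive values get sign nibble C and negative values D, and the field is declared with a fixed number of digits.

```java
final class Comp3Sketch {
    // Packs a decimal string into COMP-3 for a field declared with `digits` digits.
    static byte[] pack(String number, int digits) {
        boolean negative = number.startsWith("-");
        String body = number.replaceFirst("^[+-]", "");
        if (!body.matches("[0-9]+") || body.length() > digits) {
            throw new IllegalArgumentException(
                "not a decimal of at most " + digits + " digits: " + number);
        }
        // Left-pad with zeros: every declared digit must be present in the field.
        StringBuilder padded = new StringBuilder();
        for (int i = body.length(); i < digits; i++) {
            padded.append('0');
        }
        padded.append(body);
        // With an even digit count the very first nibble is unused; fill it with 0.
        if (digits % 2 == 0) {
            padded.insert(0, '0');
        }
        int len = padded.length();            // odd at this point
        byte[] out = new byte[(len + 1) / 2];
        int sign = negative ? 0x0D : 0x0C;
        for (int i = 0; i < out.length; i++) {
            int high = padded.charAt(2 * i) - '0';
            // The last byte holds one digit in the upper nibble and the sign in the lower.
            int low = (2 * i + 1 < len) ? padded.charAt(2 * i + 1) - '0' : sign;
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }

    public static void main(String[] args) {
        for (byte b : pack("2765000", 15)) {
            System.out.printf("%02X", b & 0xFF);
        }
        System.out.println();
    }
}
```

A 15-digit field always occupies 8 bytes with this layout, regardless of how many leading zeros the value has.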
You need to see what the un-zipped converted data look like on the mainframe.
Ask someone to "BROWSE" your file on the mainframe with "HEX ON" so you can
see what the actual HEX content of your file is. From there you should be able
to figure out what sort of hoops and loops you need to jump through to make this
work.
Here are a couple of links that may be of help to you:
IBM Mainframe Numeric data representation
ASCII to EBCDIC chart
Update: If the mainframe guys can see the correct digits when browsing with
"HEX ON" then there are two possible problems:
The digit is stored in the wrong nibble. The digit should be visible in the
lower 4 bits. If it is in the upper 4 bits, that is definitely a problem.
The non-digit nibble (upper 4 bits) does not contain HEX 'F' or a valid sign value.
Unsigned digits always contain HEX 'F' in the upper 4 bits of the byte. If the number
is signed (eg. PIC S9(4) - or something like that), the upper 4 bits of the least
significant digit (the last one) should contain HEX 'C' or 'D'.
Here is a bit of a screen shot of what BROWSE with 'HEX ON' should look like:
File Edit Edit_Settings Menu Utilities Compilers Test Help
VIEW USERID.TEST.DATA - 01.99 Columns 00001 00072
Command ===> Scroll ===> CSR
****** ***************************** Top of Data ******************************
000001 0123456789
FFFFFFFFFF44444444444444444444444444444444444444444444444444444444444444
012345678900000000000000000000000000000000000000000000000000000000000000
------------------------------------------------------------------------------
000002 |¬?"±°
012345678944444444444444444444444444444444444444444444444444444444444444
FFFFFFFFF000000000000000000000000000000000000000000000000000000000000000
------------------------------------------------------------------------------
000003 àíÃÏhr
012345678944444444444444444444444444444444444444444444444444444444444444
012345678900000000000000000000000000000000000000000000000000000000000000
------------------------------------------------------------------------------
The lines beginning with '000001', '000002' and '000003' show 'plain' text. The two lines below
each of them show the HEX representation of the characters above them. The first line of HEX
shows the upper 4 bits, the second line the lower 4 bits.
Line 1 contains the number '0123456789' followed by blank spaces (HEX 40).
Line 2 shows junk because the upper and lower nibbles are flipped. The exact silly character
is just a matter of code page selection so do not get carried away with what you see.
Line 3 shows similar junk because both upper and lower nibbles contain a digit.
Line '000001' is the sort of thing you should see for unsigned zoned decimal numbers
on an IBM mainframe using EBCDIC (single byte character set).
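If you want to check your bytes before handing the file over, the two HEX rows can be reproduced on the Java side. The following is a rough sketch with made-up names; it only splits each byte into its upper and lower nibble and does not try to imitate the rest of the BROWSE screen.

```java
import java.nio.charset.Charset;

final class HexOnSketch {
    // Row 1 of the HEX display: the upper 4 bits of every byte.
    static String upperNibbles(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0x0F, 16)));
        }
        return sb.toString();
    }

    // Row 2 of the HEX display: the lower 4 bits of every byte.
    static String lowerNibbles(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(Character.toUpperCase(Character.forDigit(b & 0x0F, 16)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Digits encoded as EBCDIC characters, like line '000001' in the screen shot.
        byte[] record = "0123456789".getBytes(Charset.forName("Cp037"));
        System.out.println(upperNibbles(record));
        System.out.println(lowerNibbles(record));
    }
}
```

For correctly encoded unsigned zoned decimal data, the first row should be all F and the second row should show the digits.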
UPDATE 2
You added a HEX display to your question on June 6th. I think maybe there
were a couple of formatting issues. If this is what
you were trying to display, the following discussion might be of help to you:
..........A..
33333333326004444
210003166750C0000
You noted that this is a display of two "numbers":
210003166 in Zoned Decimal
000000002765000 in COMP-3
This is what an IBM mainframe would be expecting:
210003166 :Á : <-- Display character
FFFFFFFFF00002600 <-- Upper 4 bits of each byte
2100031660000750C <-- Lower 4 bits of each byte
Notice the differences between what you have and the above:
The upper 4 bits of the Zoned Decimal data in your display contain
a HEX '3'; they should contain a HEX 'F'. The lower 4 bits contain the
expected digit. Get those upper 4 bits fixed
and you should be good to go. BTW... it looks to me that whatever 'conversion' you
have attempted to Zoned Decimal is having no effect. The bit patterns you have for
each digit in the Zoned Decimal correspond to digits in the ASCII character set.
In the COMP-3 field you indicated that the leading zeros could be truncated.
Sorry, but they are either part of the number or they are not! My display above
includes leading zeros. Your display appears to have truncated leading zeros and then padded
trailing bytes with spaces (HEX 40). This won't work! COMP-3 fields are defined
with a fixed number of digits and all digits must be represented - that means leading
zeros are required to fill out the high order digits of each number.
The Zoned Decimal fix should be pretty easy... The COMP-3 fix is probably just a
matter of not stripping leading zeros (otherwise it looks pretty good).
UPDATE 3...
How do you flip the 4 high order bits? I got the impression somewhere along the line that you might be doing your conversion via a Java program.
I, unfortunately, am a COBOL programmer, but I'll take a shot at it (don't
laugh)...
Based on what I have seen here, all you need to do is take each ASCII
digit and flip the high 4 bits to HEX F and the result will be the equivalent
unsigned Zoned Decimal EBCDIC digit. Try something like...
public static byte AsciiToZonedDecimal(byte b) {
    // set the upper 4 bits to HEX F...
    return (byte)(b | 0xF0);
}
Apply the above to each ASCII digit and the result should be an unsigned EBCDIC
Zoned Decimal number.
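Applied to a whole field, the idea might look like the sketch below. The class and method names are hypothetical, and the input is assumed to consist only of the ASCII digits 0-9 (no sign, no spaces).

```java
import java.nio.charset.StandardCharsets;

final class ZonedSketch {
    // Converts a string of ASCII digits into unsigned EBCDIC zoned decimal bytes.
    static byte[] toUnsignedZoned(String digits) {
        if (!digits.matches("[0-9]+")) {
            throw new IllegalArgumentException("digits only: " + digits);
        }
        byte[] ascii = digits.getBytes(StandardCharsets.US_ASCII);
        byte[] zoned = new byte[ascii.length];
        for (int i = 0; i < ascii.length; i++) {
            // ASCII '0'..'9' is 0x30..0x39; forcing the upper nibble to F gives 0xF0..0xF9.
            zoned[i] = (byte) (ascii[i] | 0xF0);
        }
        return zoned;
    }

    public static void main(String[] args) {
        for (byte b : toUnsignedZoned("210003166")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The result is a byte array, so it should be written with a byte-oriented stream rather than turned back into a String.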
UPDATE 4...
At this point the answers provided by James Anderson should put you on the right track.
James pointed you to name.benjaminjwhite.zdecimal and
this looks like it has all the Java classes you need to convert your data. The
StringToZone method
should be able to convert the IDENTIFIER string you get back from Oracle into a byte array that you then append to the
output file.
I am not very familiar with Java but I believe Java Strings are stored internally as Unicode Characters which are 16 bits long. The EBCDIC
characters you are trying to create are only 8 bits long. Given this, you might be better off writing to the output file using byte arrays (as opposed to strings).
Just a hunch from a non Java programmer.
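A small experiment supports that hunch. The sketch below pushes two packed bytes through a String and back; the names are invented for the example, and UTF-8 is assumed as the charset (a FileWriter uses the platform default, which may differ, and other charsets damage other byte values).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StringRoundTripSketch {
    // Decodes raw bytes as text and encodes them again, as happens when binary
    // data is held in a String and written through a character-based writer.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] packed = {(byte) 0x98, 0x7C};   // COMP-3 bytes for +987
        byte[] after = throughString(packed);
        System.out.println("before: " + Arrays.toString(packed));
        System.out.println("after:  " + Arrays.toString(after));
        System.out.println("unchanged: " + Arrays.equals(packed, after));
    }
}
```

Bytes that are not valid text in the chosen charset are replaced during decoding, so the packed value cannot be recovered afterwards. Writing the byte arrays directly with a FileOutputStream avoids the round trip altogether.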
The toZoned method in your question above appears to only concern itself with the first
and last characters of the string. Part of the problem is that each and every character
needs to be converted - the 4 upper bits of each byte, except possibly the last, need to be patched to contain Hex F. The lower 4 bits contain one digit.
BTW... You can pick up the source for this Java utility class at: http://www.benjaminjwhite.name/zdecimal
It sounds like the problem is in the EBCDIC conversion. Packed decimal stores raw byte values rather than characters, so it isn't subject to the EBCDIC <-> ASCII transliteration.
If they see control characters (or square markers on Windows), then they may be viewing ASCII data as EBCDIC.
If they see "ðñòóôõö÷øù" in place of "0123456789" then they are viewing EBCDIC characters in a viewer using ANSI, or extended ASCII.
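Both symptoms are easy to reproduce in Java. This is a sketch with hypothetical names; it assumes Cp037 for EBCDIC and ISO-8859-1 as a stand-in for the "ANSI" viewer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WrongViewerSketch {
    // Interprets the given bytes as text in the named charset, like a viewer would.
    static String viewAs(byte[] data, String charsetName) {
        return new String(data, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] ebcdicDigits = "0123456789".getBytes(Charset.forName("Cp037"));
        byte[] asciiDigits = "0123456789".getBytes(StandardCharsets.US_ASCII);

        // EBCDIC digits (0xF0..0xF9) shown by a Latin-1 viewer: accented letters.
        System.out.println(viewAs(ebcdicDigits, "ISO-8859-1"));

        // ASCII digits (0x30..0x39) shown by an EBCDIC viewer: non-printable characters.
        for (char c : viewAs(asciiDigits, "Cp037").toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

Comparing what the mainframe team sees against these two outputs is a quick way to tell which direction the mismatch goes.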
I am using the following string types:
LPCSTR, TCHAR, String. I want to convert:
from TCHAR to LPCSTR
from String to char
I convert from TCHAR to LPCSTR with this code:
RunPath = TEXT("C:\\1");
LPCSTR Path = (LPCSTR)RunPath;
I convert from String to char with this code:
SaveFileDialog^ saveFileDialog1 = gcnew SaveFileDialog;
saveFileDialog1->Title = "Сохранение файла-настроек";
saveFileDialog1->Filter = "bck files (*.bck)|*.bck";
saveFileDialog1->RestoreDirectory = true;
pin_ptr<const wchar_t> wch = TEXT("");
if ( saveFileDialog1->ShowDialog() == System::Windows::Forms::DialogResult::OK ) {
wch = PtrToStringChars(saveFileDialog1->FileName);
} else return;
ofstream os(wch, ios::binary);
My problem is that when I set Configuration Properties -> General -> Character Set to "Use Multi-Byte Character Set", the first part of the code works correctly, but the second part fails with error C2440. When I set Character Set to "Use Unicode Character Set", the second part works correctly, but the first part yields only the first character when converting from TCHAR to LPCSTR.
I'd suggest you need to be using Unicode the whole way through.
LPCSTR is a "Long Pointer to a Constant String", i.e. a const char*. That's typically not what you want when you're dealing with .NET methods. The char type in .NET is 16 bits wide.
You also should not use the TEXT("") macro unless you're planning multiple builds using various character encodings. Try wrapping all your string literals with the _W("") macro instead and a pure unicode build if you can.
See if that helps.
PS. std::wstring is very handy in your scenario.
EDIT
You see only one character because the string is now Unicode but you cast it as a regular string. Most Unicode characters in the ASCII range have the same number as in ASCII, but the second of their 2 bytes is set to zero. So when a Unicode string is read as a C-string you only see the first character, because C-strings are null (zero) terminated. The easy (and wrong) way to deal with this is to convert the std::wstring to a std::string and then pull the C-string out of that. This is not a safe approach because Unicode has a much larger character space than your standard encoding.