Java Card J2A040: changing the default key with a GPShell script does not work

I want to change the default key, but the GPShell script below returns 6A80.
mode_211
enable_trace
establish_context
card_connect
select -AID A000000003000000
open_sc -scp 2 -scpimpl 0x15 -security 1 -keyind 0 -keyver 0 -mac_key 404142434445464748494A4B4C4D4E4F -enc_key 404142434445464748494A4B4C4D4E4F // Open secure channel
put_sc_key -keyver 0 -newkeyver 1 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
card_disconnect
release_context
What is wrong?
My J2A040 is pre-personalised, but not fused and not protected.
Thanks for your help.

put_sc_key -keyver 0 -newkeyver 1 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
creates a new key. Because the key in key set version 1 already exists, the command fails. To replace a key, use this syntax:
put_sc_key -keyver 1 -newkeyver 1 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
If this fails, it would be interesting for me to know whether adding a new key set version works. Please try (adding key set version 2):
put_sc_key -keyver 0 -newkeyver 2 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
I think there are still some issues left in the code, which I am currently investigating; your support could be helpful here. Are you using the latest binary release for Windows / Homebrew?
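The difference between adding and replacing described in this answer comes down to the value passed as -keyver. As a minimal sketch (the function name put_key_intent is made up for illustration and is not part of GPShell), the rule can be written as:

```python
def put_key_intent(keyver: int, newkeyver: int) -> str:
    """Describe what a put_sc_key call asks the card to do.

    Assumption taken from the answer above: -keyver 0 means "add a new
    key set", any other value names the existing key set to replace.
    """
    if keyver == 0:
        return "add key set version %d" % newkeyver
    return "replace key set version %d with version %d" % (keyver, newkeyver)


# The three variants discussed in this thread.
for keyver, newkeyver in [(0, 1), (1, 1), (0, 2)]:
    print("-keyver %d -newkeyver %d: %s"
          % (keyver, newkeyver, put_key_intent(keyver, newkeyver)))
```

This only restates the rule; whether a given card accepts the request still depends on which key set versions it already holds.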

This script works for me now:
mode_211
enable_trace
establish_context
card_connect
select -AID A000000003000000
open_sc -scp 2 -scpimpl 0x15 -security 1 -keyind 0 -keyver 0 -key 404142434445464748494A4B4C4D4E4F -mac_key 404142434445464748494A4B4C4D4E4F -enc_key 404142434445464748494A4B4C4D4E4F -kek_key 404142434445464748494A4B4C4D4E4F // Open secure channel
put_sc_key -keyver 1 -newkeyver 0 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
card_disconnect
release_context
With this:
put_sc_key -keyver 0 -newkeyver 2 -mac_key 404142434445464748494A4B4C4D4E4E -enc_key 404142434445464748494A4B4C4D4E4E -kek_key 404142434445464748494A4B4C4D4E4E -current_kek 404142434445464748494A4B4C4D4E4F
it works too.
But what I actually want is to replace the three default keys (S-ENC, S-MAC, DEK), not add new ones. Now I have three new keys with version 2; see the picture.
Picture of the new version 2 keys in pyResMan
How do I now delete the version 2 keys?


Forwarding traffic from port ttyS3 to ttyUSB0 - input/output error

I am attempting to set up a basic pipe that'll transfer all data written to ttyS3 to ttyUSB0. I found a few solutions to the problem such as this, but they don't seem to help much. The issue seems to be that anytime I do anything with ttyS3, I get this:
stty: /dev/ttyS3: Input/output error
Doing ls -l /dev/ttyS* and the same for /dev/ttyUSB* I get the following:
root@arm-64:~# ls -l /dev/ttyS*
crw-rw---- 1 root dialout 4, 64 Feb 9 13:08 /dev/ttyS0
crw-rw---- 1 root dialout 4, 65 Feb 9 13:08 /dev/ttyS1
crw--w---- 1 root tty 4, 66 Feb 9 13:08 /dev/ttyS2
crw-rw---- 1 root dialout 4, 67 Feb 9 13:08 /dev/ttyS3
crw-rw---- 1 root dialout 4, 68 Feb 9 13:08 /dev/ttyS4
root@arm-64:~# ls -l /dev/ttyUSB*
crw-rw---- 1 root dialout 188, 0 Feb 9 13:08 /dev/ttyUSB0
I've created the following script to do the job for me at startup. I changed the major/minor values to match that of USB0 after reading somewhere that this could work as a pipe. Although it does execute without throwing an Input/output error, it doesn't seem to work as intended.
#!/bin/bash
rm /dev/ttyS3
mknod -m 666 /dev/ttyS3 c 188 0
chown root.dialout /dev/ttyS3
chmod 666 /dev/ttyS3
stty -F /dev/ttyUSB0 speed 115200 cs8
stty -F /dev/ttyS3 speed 115200 cs8
cat /dev/ttyS3 > /dev/ttyUSB0 &
I just need to create a basic pipe that'll take all data written to ttyS3 and pass it on to ttyUSB0. Although I don't think it's relevant, I'm running Armbian bullseye on a TV box (Tx3 Mini)
"I just need to create a basic pipe that'll take all data written to ttyS3 and pass it on to ttyUSB0"
I don't see a problem, so long as each serial terminal is properly set up and operational. Before you create the "pipe", did you verify that each serial terminal is operating properly?
On a SBC I have the console on a serial terminal, and established two more serial terminals using a SoC USART and a USB adapter:
# ls -l /dev/tty*S*
crw-rw---- 1 root dialout 246, 0 Jan 1 2012 /dev/ttyGS0
crw------- 1 root tty 4, 64 Jul 31 22:46 /dev/ttyS0
crw-rw---- 1 root dialout 4, 65 Jul 31 22:25 /dev/ttyS1
crw-rw---- 1 root dialout 188, 0 Jul 31 22:28 /dev/ttyUSB0
#
Note that the udev daemon created these device nodes, and no funny business (i.e. manual re-creating device nodes) was necessary to accomplish the "pipe".
To remove canonical processing, each serial terminal is put in raw mode and with matching baudrates:
# stty raw 115200 -F /dev/ttyUSB0
# stty raw 115200 -F /dev/ttyS1
A report of all termios settings:
# stty -aF /dev/ttyUSB0
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff
-iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke -flusho -extproc
#
# stty -aF /dev/ttyS1
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff
-iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke -flusho -extproc
#
Then when the command
# cat /dev/ttyS1 > /dev/ttyUSB0 &
is issued, whatever is typed on the remote terminal-emulator program connected to /dev/ttyS1 shows up on the remote terminal-emulator program connected to /dev/ttyUSB0.
This seems to behave like the desired "basic pipe that'll take all data written to ttyS? and pass it on to ttyUSB0".
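For readers who would rather do the forwarding from a program than from cat, here is a minimal Python sketch of the same one-way copy. The function name forward and the chunk size are illustrative choices, and the sketch assumes both devices were already put in raw mode with stty as shown above.

```python
import os


def forward(src_fd: int, dst_fd: int, chunk_size: int = 4096) -> int:
    """Copy bytes from src_fd to dst_fd until EOF; return the byte count."""
    total = 0
    while True:
        data = os.read(src_fd, chunk_size)
        if not data:
            # An empty read means end of file (the writer closed its end).
            break
        view = memoryview(data)
        while view:
            # os.write may accept fewer bytes than offered, so keep going.
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(data)
    return total
```

To use it on the two terminals, open them with os.open("/dev/ttyS1", os.O_RDONLY | os.O_NOCTTY) and os.open("/dev/ttyUSB0", os.O_WRONLY | os.O_NOCTTY) and pass the two descriptors to forward. Like cat, it blocks on the source device, and it will hit the same Input/output error if the source node has no working hardware behind it.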
Bottom line:
I am unable to duplicate the problem, and can create a "pipe" between two serial links.
# uname -a
Linux sama5d2-xplained 5.4.81-linux4sam-2020.10 #1 Thu Jan 14 12:54:56 UTC 2021 armv7l armv7l armv7l GNU/Linux
#
The issue seems to be that anytime I do anything with ttyS3, I get this:
stty: /dev/ttyS3: Input/output error
... I'm running Armbian bullseye on a TV box (Tx3 Mini)
As previously mentioned, you need to verify that each serial terminal is operating properly.
Since a "TV box" doesn't really need five (!) serial terminals, you might be seeing/creating bogus device nodes that don't have any hardware to access.
Search the system log for the actual hardware that was initialized, e.g. 'dmesg | grep tty'. One of those UARTs might be used to interface to an IR receiver.

How to enable the Linux perf tool's branch sampling

I use the Linux perf tool to collect branch information for programs. The command and its result are as follows:
$ sudo perf record -b /bin/ls
Error:
No hardware sampling interrupt available.
No APIC? If so then you can boot the kernel with the "lapic" boot parameter to force-enable it.
The content of /proc/cpuinfo is below:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
stepping : 10
microcode : 0xa07
cpu MHz : 1994.921
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 **apic** sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
pbe syscall nx lm constant_tsc arch_perfmon pebs **bts** rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm
dca sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bugs :
bogomips : 3989.84
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:
I tried to emphasize apic and bts in the flags entry (they just ended up wrapped in "**"), and I don't know what else is important for this case. The other 7 processors are identical to processor 0.
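Picking individual flags out of that long line by eye is error-prone, so here is a minimal Python sketch that does the same check. The helper name cpu_flags is made up for illustration, and it only reports what the kernel lists in /proc/cpuinfo; it does not by itself say whether perf record -b will work.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags of the first processor entry in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            flags = cpu_flags(handle.read())
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        flags = set()
    for name in ("apic", "bts", "pebs", "arch_perfmon"):
        print(name, "present" if name in flags else "absent")
```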
The boot parameter "lapic" is added by modifying /boot/grub/grub.cfg:
menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-0ed8a872-4eb7-4339-a0bb-6c0033da582e' {
recordfail
load_video
gfxmode $linux_gfx_mode
insmod gzio
insmod part_msdos
insmod ext2
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 ced80bc6-08a9-4909-9717-97658cf0c4fd
else
search --no-floppy --fs-uuid --set=root ced80bc6-08a9-4909-9717-97658cf0c4fd
fi
linux /vmlinuz-4.2.0-42-generic root=/dev/mapper/fedora_hustyong-root ro **lapic** quiet splash $vt_handoff
initrd /initrd.img-4.2.0-42-generic
}
I just added lapic to the linux entry, but it made no difference after rebooting.
My questions:
1) What does the error message mean?
2) Does perf's branch sampling use Intel Branch Trace Store (BTS) or Last Branch Record (LBR)?
3) How can I check whether LBR is supported?
4) How does LBR and BTS support differ between 32-bit and 64-bit x86?
My OS is Ubuntu 14.04 64bit:
$ uname -a
Linux user-S5000VSA 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
The perf install instructions:
$ sudo apt-get install linux-tools-common
$ sudo apt-get install linux-tools-4.2.0-27-generic linux-cloud-tools-4.2.0-27-generic
update:
the content of /proc/interrupts:
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 127780 52084 127784 126729 127706 128431 127785 126822 IO-APIC 2-edge timer
1: 52 42 3 2 62 49 5 2 IO-APIC 1-edge i8042
8: 0 0 0 0 0 0 0 1 IO-APIC 8-edge rtc0
9: 0 0 0 0 0 0 0 0 IO-APIC 9-fasteoi acpi
12: 1428 1307 52 47 1424 1324 53 58 IO-APIC 12-edge i8042
14: 0 0 0 0 0 0 0 0 IO-APIC 14-edge ata_piix
15: 0 0 0 0 0 0 0 0 IO-APIC 15-edge ata_piix
17: 47 276 1004 49 52 295 993 50 IO-APIC 17-fasteoi radeon
20: 31062 4201 7533 29935 31080 4297 7540 29824 IO-APIC 20-fasteoi ata_piix
22: 0 0 0 0 0 0 0 0 IO-APIC 22-fasteoi uhci_hcd:usb3, uhci_hcd:usb5
23: 0 0 0 0 0 0 0 0 IO-APIC 23-fasteoi ehci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb4
25: 2 755654 3 3 1 1 3 6 PCI-MSI 2621440-edge eth0
27: 0 0 0 0 0 0 0 1 PCI-MSI 131072-edge ioat-msi
NMI: 6756 678 6894 5867 861 2168 4994 3700 Non-maskable interrupts
LOC: 343554 578094 1736638 773135 219952 777567 1459249 689292 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 6756 678 6894 5867 861 2168 4994 3700 Performance monitoring interrupts
IWI: 6756 678 6894 5867 861 2168 4994 3700 IRQ work interrupts
RTR: 0 0 0 0 0 0 0 0 APIC ICR read retries
RES: 82594 294601 142535 259797 77845 316210 84927 261455 Rescheduling interrupts
CAL: 4749 9296 7358 31330 7560 8564 5751 20364 Function call interrupts
TLB: 5933 2044 12867 11215 6563 4682 8669 8272 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 Threshold APIC interrupts
DFR: 0 0 0 0 0 0 0 0 Deferred Error APIC interrupts
MCE: 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 292 292 292 292 292 292 292 292 Machine check polls
HYP: 0 0 0 0 0 0 0 0 Hypervisor callback interrupts
ERR: 0
MIS: 0
PIN: 0 0 0 0 0 0 0 0 Posted-interrupt notification event
PIW: 0 0 0 0 0 0 0 0 Posted-interrupt wakeup event
I installed Ubuntu 16.10 64-bit on my PC and perf record -b ran successfully. I think the problem may lie in the kernel or in the linux-tools-4.2.0-27-generic or linux-cloud-tools-4.2.0-27-generic package.

GPShell "delete_key" command returns 6A80 (Wrong data)

I imported several GlobalPlatform keys with different key versions into my Java Card. I can open a secure channel with a new key, but when I try to delete one of them I receive SW=6A80. My script is:
mode_211
enable_trace
establish_context
card_connect -readerNumber 1
select -AID A000000018434D00
open_sc -security 0 -keyind 0 -keyver 02 -mac_key 47454d5850524553534f53414d504c45 -enc_key 47454d5850524553534f53414d504c45 -kek_key 47454d5850524553534f53414d504c45 // Open secure channel
delete_key -keyver 08 -keyind 0
get_status -element 40
card_disconnect
release_context
I also tried a second script, which opens the secure channel with the key version being deleted (08), but the result is still SW=6A80:
mode_211
enable_trace
establish_context
card_connect -readerNumber 1
select -AID A000000018434D00
open_sc -security 0 -keyind 0 -keyver 08 -mac_key 404142434445464748494a4b4c4d4e4f -enc_key 404142434445464748494a4b4c4d4e4f -kek_key 404142434445464748494a4b4c4d4e4f // Open secure channel
delete_key -keyver 08 -keyind 0
get_status -element 40
card_disconnect
release_context
the apdu trace is:
Command --> 80CA006600
Wrapped command --> 80CA006600
Response <-- 664C734A06072A864886FC6B01600C060A2A864886FC6B02020101630906072A864886FC6B03640B06092A864886FC6B040105650B06092B8510864864020103660C060A2B060104012A026E01029000
Command --> 8050020008919F9B915C23C5D600
Wrapped command --> 8050020008919F9B915C23C5D600
Response <-- 4D0022840106A57C224F020137AFC43375EF54A1A60DF8A01B351A189000
Command --> 8482000010E61BDA493C17D649ED414E4AD2356F3C
Wrapped command --> 8482000010E61BDA493C17D649ED414E4AD2356F3C
Response <-- 9000
delete_key -keyver 08 -keyind 0
Command --> 80E4000006D00100D2010800
Wrapped command --> 80E4000006D00100D2010800
Response <-- 6A80
delete_key() return 0x80206A80 (6A80: Wrong data / Incorrect values in command data.)
get_status -element 40
Command --> 80F24000024F0000
Wrapped command --> 80F24000024F0000
Response <-- 09A0000003080000100007049000
Can anyone help me? Thanks a lot.
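To see exactly what the rejected command asks the card to do, the DELETE APDU from the trace (80E4000006D00100D2010800) can be taken apart. The following is a minimal Python sketch, assuming single-byte tags and lengths and the GlobalPlatform meaning of tag D0 (key identifier) and tag D2 (key version number); the function names are made up for illustration.

```python
def parse_simple_tlvs(data: bytes) -> list:
    """Split data into (tag, value) pairs with one-byte tags and lengths."""
    items = []
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError("truncated TLV header")
        tag, length = data[i], data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        items.append((tag, value))
        i += 2 + length
    return items


def parse_delete_key_apdu(apdu_hex: str) -> dict:
    """Decode the header and key reference fields of a DELETE (key) APDU."""
    apdu = bytes.fromhex(apdu_hex)
    cla, ins, p1, p2, lc = apdu[:5]
    fields = dict(parse_simple_tlvs(apdu[5:5 + lc]))
    return {
        "cla": cla,
        "ins": ins,
        "p1": p1,
        "p2": p2,
        "key_identifier": fields.get(0xD0),  # tag D0
        "key_version": fields.get(0xD2),     # tag D2
    }


if __name__ == "__main__":
    print(parse_delete_key_apdu("80E4000006D00100D2010800"))
```

For the APDU in the trace this yields INS E4 with key identifier 00 and key version 08, which matches the -keyind 0 and -keyver 08 options given to delete_key.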

Change the GlobalPlatform default key set of my Java Card

I have finished my applet, and I want to use GPShell to change the card's default key set to prevent anyone else from replacing or deleting my applet.
my script to do so is as follows:
mode_211
enable_trace
establish_context
enable_trace
card_connect
open_sc -security 1 -keyind 0 -keyver 0 -mac_key 404142434445464748494a4b4c4d4e4f -enc_key 404142434445464748494a4b4c4d4e4f // Open secure channel
put_sc_key -keyver 1 -newkeyver 1 -mac_key 404142434445464748494a4b4c4d4e4e -enc_key 404142434445464748494a4b4c4d4e4e -kek_key 404142434445464748494a4b4c4d4e4e -cur_kek 404142434445464748494a4b4c4d4e4f
card_disconnect
release_context
But when I run this script, GPShell returns the following error:
mode_211
enable_trace
establish_context
enable_trace
card_connect
open_sc -security 1 -keyind 0 -keyver 0 -mac_key 404142434445464748494a4b4c4d4e4f -enc_key 404142434445464748494a4b4c4d4e4f // Open secure channel
Command --> 80CA006600
Wrapped command --> 80CA006600
Response <-- 664C734A06072A864886FC6B01600C060A2A864886FC6B02020101630906072A864886FC6B03640B06092A864886FC6B040215650B06092B8510864864020103660C060A2B060104012A026E01029000
Command --> 80500000089AA60E4925924D6900
Wrapped command --> 80500000089AA60E4925924D6900
Response <-- 000011370001AB741C0BFF02047E4413D6E4873750AB69F325A1E4FF9000
Command --> 848201001056D480DA94FF6A33778F6D68A7497C8C
Wrapped command --> 848201001056D480DA94FF6A33778F6D68A7497C8C
Response <-- 9000
put_sc_key -keyver 1 -newkeyver 1 -mac_key 404142434445464748494a4b4c4d4e4e -enc_key 404142434445464748494a4b4c4d4e4e -kek_key 404142434445464748494a4b4c4d4e4e -cur_kek 404142434445464748494a4b4c4d4e4f
Error: unknown option -cur_kek
Can anyone help me solve the problem? Is any of my options wrong? Can you write the correct GPShell script for me?
Thanks in advance.
Try -current_kek instead of -cur_kek as there seems to be a typo in the gpshell documentation.
The relevant part of the source code is here.

"DebugInfo for CritSec does not point back to the critical section" when analysing deadlock

I'm using WinDbg to analyse a deadlock occurring in a DataSnap application server written in Delphi.
When I run
!analyze -hang -v
I get this
0:000:x86> !analyze -hang -v
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************
GetPageUrlData failed, server returned HTTP status 404
URL requested: http://watson.microsoft.com/00000000.htm?Retriage=1
FAULTING_IP:
+6ced240
00000000 ?? ???
EXCEPTION_RECORD: ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 0000000000000000
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 0
FAULTING_THREAD: 0000000000000000
BUGCHECK_STR: HANG
DEFAULT_BUCKET_ID: APPLICATION_HANG
PROCESS_NAME: ********.exe
ERROR_CODE: (NTSTATUS) 0xcfffffff -
EXCEPTION_CODE: (NTSTATUS) 0xcfffffff -
MOD_LIST:
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
DERIVED_WAIT_CHAIN:
Dl Eid Cid WaitType
-- --- ------- --------------------------
0 c7c.2634 Critical Section
WAIT_CHAIN_COMMAND: ~0s;k;;
BLOCKING_THREAD: 0000000000002634
PRIMARY_PROBLEM_CLASS: APPLICATION_HANG
LAST_CONTROL_TRANSFER: from 0000000077138df4 to 000000007711f8b1
STACK_TEXT:
0018fc50 77138df4 00000c6c 00000000 00000000 ntdll_77100000!NtWaitForSingleObject+0x15
0018fcb4 77138cd8 00000000 00000000 03fe0940 ntdll_77100000!RtlpWaitOnCriticalSection+0x13e
0018fcdc 7369324f 736a3134 00000000 03fe0940 ntdll_77100000!RtlEnterCriticalSection+0x150
WARNING: Stack unwind information not available. Following frames may be wrong.
0018fcec 7369af5f 00000388 00000000 003d1e00 mswsock!GetLspGuid+0x19af
0018fd08 76366958 00000388 0018fd84 0018fd9c mswsock!GetLspGuid+0x96bf
0018fd38 0018fd58 763668cd 00000388 0018fd84 ws2_32!WSAAccept+0x84
00000000 00000000 00000000 00000000 00000000 0x18fd58
FOLLOWUP_IP:
mswsock!GetLspGuid+19af
7369324f 33db xor ebx,ebx
SYMBOL_STACK_INDEX: 3
SYMBOL_NAME: mswsock!GetLspGuid+19af
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: C:\Windows\System32\mswsock
IMAGE_NAME: lld
DEBUG_FLR_IMAGE_TIMESTAMP: 4ce7c83d
STACK_COMMAND: ~0s ; kb
FAILURE_BUCKET_ID: APPLICATION_HANG_cfffffff_lld!Unloaded
BUCKET_ID: X64_HANG_mswsock!GetLspGuid+19af
WATSON_STAGEONE_URL: http://watson.microsoft.com/00000000.htm?Retriage=1
Followup: MachineOwner
---------
I then did
!locks -V
to see which lock it was waiting on, and to my surprise it returned this:
0:000:x86> !locks -V
CritSec ntdll!RtlCriticalSectionLock+0 at 0000000077057060
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec ntdll!LdrpLoaderLock+0 at 0000000077057490
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec ntdll!RtlpDynamicFunctionTableLock+0 at 0000000077057468
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec ntdll!FastPebLock+0 at 000000007705a900
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec ntdll!RtlpProcessHeapsListLock+0 at 000000007705a240
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec +270208 at 0000000000270208
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 1
CritSec ntdll!EtwProvCritSect+0 at 000000007705a120
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec ntdll!EtwPrivSessionCritSect+0 at 000000007705a1e0
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec +10208 at 0000000000010208
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
CritSec +276f40 at 0000000000276f40
LockCount NOT LOCKED
RecursionCount 0
OwningThread 0
EntryCount 0
ContentionCount 0
Scanned 10 critical sections
From looking at the call stack
STACK_TEXT:
0018fc50 77138df4 00000c6c 00000000 00000000 ntdll_77100000!NtWaitForSingleObject+0x15
0018fcb4 77138cd8 00000000 00000000 03fe0940 ntdll_77100000!RtlpWaitOnCriticalSection+0x13e
0018fcdc 7369324f 736a3134 00000000 03fe0940 ntdll_77100000!RtlEnterCriticalSection+0x150
WARNING: Stack unwind information not available. Following frames may be wrong.
0018fcec 7369af5f 00000388 00000000 003d1e00 mswsock!GetLspGuid+0x19af
0018fd08 76366958 00000388 0018fd84 0018fd9c mswsock!GetLspGuid+0x96bf
0018fd38 0018fd58 763668cd 00000388 0018fd84 ws2_32!WSAAccept+0x84
00000000 00000000 00000000 00000000 00000000 0x18fd58
I determined it was waiting on a critical section at address 0x736a3134 (the first parameter passed to RtlEnterCriticalSection), so I ran this:
!critsec 736a3134
That gave me this output
0:000:x86> !critsec 736a3134
DebugInfo for CritSec at 00000000736a3134 does not point back to the critical section
NOT an initialized critical section.
CritSec mswsock!WSPStartup+6f64 at 00000000736a3134
WaiterWoken Yes
LockCount -1
RecursionCount 11028
OwningThread c6c
EntryCount 1f49dad6
ContentionCount 88000000
*** Locked
Now the penny dropped: the pointer to the critical section has become corrupted, possibly due to concurrent thread access and a lack of synchronisation elsewhere in the code.
My question is: how do I track down where this happens, or find out whether it is a different problem?
PS: This bug only appears when the application is under heavy load, with maybe 700 clients connected.
(It uses one thread per connection. I know that 32-bit applications are limited to approximately 2000 threads at the default thread stack size, and that this is not the best approach.)
PPS: I have multiple crash dumps where the application is hung waiting on different critical sections; in each case the pointer for the critical section appears not to point to a critical section.
Looking at the output of !analyze -hang -v, it seems you are not using Application Verifier. I would recommend collecting a hang dump after enabling the Application Verifier "Locks" option. It would certainly give you more information for troubleshooting.
You can download Application Verifier from here:
http://www.microsoft.com/en-us/download/details.aspx?id=20028
More information:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd371695(v=vs.85).aspx
Just to let you know, we gave up trying to find out what was causing this, as it only happened when the program was close to its maximum virtual memory space (2.1 GB for a 32-bit app) due to the one-thread-per-connection approach we were using.
In the end we redesigned the clients so they no longer use this server application and use a SOAP server instead.
The SOAP server seems to scale much better than the DataSnap/MIDAS setup we were using, although we have yet to test it at the client's site where the initial problem appeared.
