jmeter runs for scheduler duration but hangs at end - multithreading

I'm running jmeter from the command line with a 300 second duration.
However it rarely finishes the whole job and returns to the command line - I mostly have to cancel it.
This is what I see:
C:\dev\tools\apache-jmeter-3.1\bin>jmeter.bat -n -t c:/dev/workspace/docs/JMeter-stress2.jmx -j c:/dev/log/jmeter.log -l c:/dev/log/jmeter-results.csv
Writing log file to: c:\dev\log\jmeter.log
Creating summariser <summary>
Created the tree successfully using c:/dev/workspace/docs/JMeter-stress2.jmx
Starting the test # Tue Mar 07 15:43:07 GMT 2017 (1488901387136)
Waiting for possible Shutdown/StopTestNow/Heapdump message on port 4445
summary + 1573 in 00:00:23 = 69.0/s Avg: 166 Min: 47 Max: 2175 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary + 2135 in 00:00:30 = 71.3/s Avg: 150 Min: 44 Max: 4022 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 3708 in 00:00:53 = 70.3/s Avg: 157 Min: 44 Max: 4022 Err: 0 (0.00%)
summary + 2039 in 00:00:30 = 68.0/s Avg: 187 Min: 44 Max: 31024 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 5747 in 00:01:23 = 69.4/s Avg: 168 Min: 44 Max: 31024 Err: 0 (0.00%)
summary + 2051 in 00:00:30 = 68.3/s Avg: 168 Min: 41 Max: 30813 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 7798 in 00:01:53 = 69.2/s Avg: 168 Min: 41 Max: 31024 Err: 0 (0.00%)
summary + 2296 in 00:00:30 = 76.5/s Avg: 168 Min: 41 Max: 32443 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 10094 in 00:02:23 = 70.7/s Avg: 168 Min: 41 Max: 32443 Err: 0 (0.00%)
summary + 1015 in 00:00:30 = 33.8/s Avg: 348 Min: 42 Max: 30255 Err: 5 (0.49%) Active: 12 Started: 12 Finished: 0
summary = 11109 in 00:02:53 = 64.3/s Avg: 184 Min: 41 Max: 32443 Err: 5 (0.05%)
summary + 1880 in 00:00:30 = 62.6/s Avg: 177 Min: 41 Max: 30265 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 12989 in 00:03:23 = 64.1/s Avg: 183 Min: 41 Max: 32443 Err: 5 (0.04%)
summary + 1499 in 00:00:30 = 50.0/s Avg: 262 Min: 41 Max: 30417 Err: 5 (0.33%) Active: 12 Started: 12 Finished: 0
summary = 14488 in 00:03:53 = 62.2/s Avg: 191 Min: 41 Max: 32443 Err: 10 (0.07%)
summary + 2383 in 00:00:30 = 79.4/s Avg: 148 Min: 42 Max: 3687 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 16871 in 00:04:23 = 64.2/s Avg: 185 Min: 41 Max: 32443 Err: 10 (0.06%)
summary + 1870 in 00:00:30 = 62.3/s Avg: 172 Min: 41 Max: 30890 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary = 18741 in 00:04:53 = 64.0/s Avg: 184 Min: 41 Max: 32443 Err: 10 (0.05%)
summary + 483 in 00:00:35 = 14.0/s Avg: 344 Min: 43 Max: 31082 Err: 3 (0.62%) Active: 1 Started: 12 Finished: 11
summary = 19224 in 00:05:27 = 58.7/s Avg: 188 Min: 41 Max: 32443 Err: 13 (0.07%)
Terminate batch job (Y/N)? y
The last summary line stays on screen indefinitely; JMeter never returns to the prompt until I kill it.
The errors are from kerberos, which doesn't have a good reputation in this organisation :( It puts the error logging into the *.csv output file which makes it unusable, but I guess that's a different question. I only mention it because it might be the cause of the hanging.
This is what I see at the end of the log file. Note the timestamp of the shutdown message: the log statement before it is the last one written before the hang. The errors in the logging stem from connection problems with the kerberos server.
2017/03/07 15:48:00 INFO - jmeter.reporters.Summariser: summary + 1870 in 00:00:30 = 62.3/s Avg: 172 Min: 41 Max: 30890 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
2017/03/07 15:48:00 INFO - jmeter.reporters.Summariser: summary = 18741 in 00:04:53 = 64.0/s Avg: 184 Min: 41 Max: 32443 Err: 10 (0.05%)
2017/03/07 15:48:04 ERROR - jmeter.protocol.http.sampler.HTTPHC4Impl: Can't execute httpRequest with subject:Subject:
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET get_forecast 5-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET get_forecast 5-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET get_forecast 5-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET get_forecast 5-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET forecast with history 4-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET forecast with history 4-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST data/save 2-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: POST data/save 2-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST forecast/save 3-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: POST forecast/save 3-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET forecast with history 4-3
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET forecast with history 4-3
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST data/save 2-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: POST data/save 2-1
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET forecast with history 4-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET forecast with history 4-2
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: GET get_forecast 5-3
2017/03/07 15:48:07 INFO - jmeter.threads.JMeterThread: Thread finished: GET get_forecast 5-3
2017/03/07 15:48:08 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST data/save 2-3
2017/03/07 15:48:08 INFO - jmeter.threads.JMeterThread: Thread finished: POST data/save 2-3
2017/03/07 15:48:13 ERROR - jmeter.protocol.http.sampler.HTTPHC4Impl: Can't execute httpRequest with subject:Subject:
2017/03/07 15:48:13 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST forecast/save 3-3
2017/03/07 15:48:13 INFO - jmeter.threads.JMeterThread: Thread finished: POST forecast/save 3-3
2017/03/07 15:48:34 INFO - jmeter.reporters.Summariser: summary + 483 in 00:00:35 = 14.0/s Avg: 344 Min: 43 Max: 31082 Err: 3 (0.62%) Active: 1 Started: 12 Finished: 11
2017/03/07 15:48:34 INFO - jmeter.reporters.Summariser: summary = 19224 in 00:05:27 = 58.7/s Avg: 188 Min: 41 Max: 32443 Err: 13 (0.07%)
2017/03/07 15:48:34 INFO - jmeter.threads.JMeterThread: Stopping because end time detected by thread: POST forecast/save 3-2
2017/03/07 15:48:34 INFO - jmeter.threads.JMeterThread: Thread finished: POST forecast/save 3-2
2017/03/07 15:51:21 INFO - jmeter.reporters.ResultCollector: Shutdown hook started
2017/03/07 15:51:21 INFO - jmeter.reporters.ResultCollector: Shutdown hook ended
Update 2017-03-10
Only progress in defining the problem better :(
Why isn't JMeter dropping the connections when I set the timeouts to 5 secs (connect) + 5 secs (response) in the HTTP Request Defaults dialog?
Why do I see a max time of > 30000 ms in the JMeter output despite those timeouts?
Why do I see no stack traces on the server side? Possibly the exceptions are being swallowed.

Have you tried lowering the simulated load? I see that the max response time is around 30 seconds; JMeter may fail to stop after the desired duration because some threads may be blocked waiting for server responses.
You should also use jvisualvm to monitor JMeter's JVM while running the load test, both to ensure there is enough memory available and to look at waiting threads. That may help you find the issue.

In Excel, how do I select specific rows in a table based on matching criteria in a separate table

I have an Excel sheet of account records (table 1) and I need to select only those rows that meet/match multiple criteria conditions in a separate Excel sheet (table 2)
For example, I need to select only the rows in table 1 where the "BalanceBand" is a match between the two tables, the "ScoreBand" is a match between the two tables, AND the "Selected Units" in the second table is greater than 0. So, I am taking a single row at a time in the account table and need to run it through the entire second table to see if there is a match.
TABLE 1
| Account | BuyID | Status | Balance | BalanceBand | ExperianScore | ExperianScoreBand | State |
|---|---|---|---|---|---|---|---|
| 4564077 | PEN033 | DECEASED | 12532.74 | $10,000-20,000 | 0 | UNSCORED | MN |
| 4564078 | PEN033 | PTPPLC | 20618.8 | $20,000+ | 713 | 700-750 | GA |
| 4564079 | PEN033 | PTPPLC | 1601.21 | $1,000-2,000 | 623 | 600-650 | PR |
| 4564080 | PEN033 | JORMANDY | 26378.45 | $20,000+ | 619 | 600-650 | CO |
| 4564081 | PEN033 | PTPPLC | 17330.38 | $10,000-20,000 | 647 | 600-650 | TX |
| 4564082 | PEN033 | NEWACCTPLC | 1594.42 | $1,000-2,000 | | UNSCORED | TX |
| 4564083 | PEN033 | PTPPLC | 20097.07 | $20,000+ | 622 | 600-650 | MD |
TABLE 2
| Balance Band | ScoreBand | Units | Balance | Weight % | GL % | GL $ | CC | CC Recoupment | Contingency Fee | NL | ROI | Units | Selected Units |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $1,000-2,000 | 400-450 | 1 | 1,606 | 0.03% | 15% | 239 | 165 | 79 | 64 | 88 | 0.53x | 1 | - |
| $1,000-2,000 | 450-500 | 0 | - | 0.00% | 56% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | 500-550 | 0 | - | 0.00% | 24% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | 550-600 | 2 | 3,756 | 0.07% | 24% | 894 | 330 | 158 | 241 | 480 | 1.46x | 2 | - |
| $1,000-2,000 | 600-650 | 4 | 6,457 | 0.12% | 30% | 1,955 | 660 | 316 | 528 | 1,083 | 1.64x | 4 | - |
| $1,000-2,000 | 650-700 | 0 | - | 0.00% | 28% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | 700-750 | 0 | - | 0.00% | 40% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | 750-800 | 0 | - | 0.00% | 32% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | 800-850 | 0 | - | 0.00% | 0% | - | - | - | - | - | 0.00x | 0 | - |
| $1,000-2,000 | Unscored | 43 | 68,235 | 1.23% | 6% | 4,096 | 7,095 | 3,397 | 1,106 | (708) | -0.10x | 43 | - |
| $2,000-3,000 | 400-450 | 3 | 8,374 | 0.15% | 12% | 1,032 | 495 | 237 | 279 | 495 | 1.00x | 3 | - |
| $2,000-3,000 | 450-500 | 8 | 19,512 | 0.35% | 11% | 2,092 | 1,320 | 632 | 565 | 839 | 0.64x | 8 | - |
| $2,000-3,000 | 500-550 | 10 | 26,743 | 0.48% | 13% | 3,574 | 1,650 | 790 | 965 | 1,749 | 1.06x | 10 | - |
| $2,000-3,000 | 550-600 | 10 | 25,808 | 0.47% | 13% | 3,259 | 1,650 | 790 | 880 | 1,519 | 0.92x | 10 | - |
| $2,000-3,000 | 600-650 | 2 | 4,674 | 0.08% | 10% | 462 | 330 | 158 | 125 | 166 | 0.50x | 2 | - |
| $2,000-3,000 | 650-700 | 1 | 2,609 | 0.05% | 12% | 322 | 165 | 79 | 87 | 149 | 0.90x | 1 | - |
| $2,000-3,000 | 700-750 | 1 | 2,389 | 0.04% | 13% | 311 | 165 | 79 | 84 | 141 | 0.85x | 1 | - |
| $2,000-3,000 | 750-800 | 1 | 2,904 | 0.05% | 47% | 1,379 | 165 | 79 | 372 | 920 | 5.58x | 1 | 1 |
| $2,000-3,000 | 800-850 | 0 | - | 0.00% | 0% | - | - | - | - | - | 0.00x | 0 | - |
| $2,000-3,000 | Unscored | 11 | 26,643 | 0.48% | 8% | 2,161 | 1,815 | 869 | 584 | 632 | 0.35x | 11 | - |
| $3,000-5,000 | 400-450 | 5 | 19,239 | 0.35% | 57% | 10,959 | 825 | 395 | 2,959 | 7,570 | 9.18x | 5 | 5 |
| $3,000-5,000 | 450-500 | 20 | 75,466 | 1.36% | 20% | 15,238 | 3,300 | 1,580 | 4,114 | 9,404 | 2.85x | 20 | 20 |
I tried many combinations of INDEX, MATCH, COUNTIF, and VLOOKUP but could never get it to work like I wanted. Any help is greatly appreciated.
This was quite challenging, since the first and second tables don't share a matching ID, but this should do the trick (the formulas refer to the second table as Table4):
=FILTER(Table1,
MMULT(
(TRANSPOSE(Table4[Balance Band])=Table1[BalanceBand])*(TRANSPOSE(Table4[ScoreBand])=Table1[ExperianScoreBand]),
--(ISNUMBER(Table4[Selected Units]))),
"no matches")
Alternatively:
=FILTER(Table1,
BYROW(Table1,LAMBDA(b,LET(bb,INDEX(b,,5),sb,INDEX(b,,7),
SUMPRODUCT((Table4[Balance Band]=bb)*(Table4[ScoreBand]=sb)*(ISNUMBER(Table4[Selected Units])))))))
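To make the matching rule easier to follow outside Excel, here is a minimal Python sketch of the same idea: keep an account row only if its (balance band, score band) pair appears in the second table with a numeric, positive Selected Units. The function names are made up for illustration, the account 9000001 is a hypothetical row added so that the sample produces a match, and the case-insensitive comparison is an assumption meant to mirror how Excel's `=` treats text such as UNSCORED and Unscored.

```python
def selected_keys(table2):
    """Collect (balance band, score band) pairs whose Selected Units is a positive number."""
    keys = set()
    for row in table2:
        units = row["Selected Units"]
        # "-" in the sheet is text, so only numeric values above zero count.
        if isinstance(units, (int, float)) and units > 0:
            keys.add((row["Balance Band"].casefold(), row["ScoreBand"].casefold()))
    return keys


def filter_accounts(table1, table2):
    """Return the account rows whose band pair is selected in table2."""
    keys = selected_keys(table2)
    return [
        row for row in table1
        if (row["BalanceBand"].casefold(), row["ExperianScoreBand"].casefold()) in keys
    ]


# A few rows taken from the question's second table.
table2 = [
    {"Balance Band": "$1,000-2,000", "ScoreBand": "600-650", "Selected Units": "-"},
    {"Balance Band": "$2,000-3,000", "ScoreBand": "750-800", "Selected Units": 1},
    {"Balance Band": "$3,000-5,000", "ScoreBand": "400-450", "Selected Units": 5},
]

table1 = [
    {"Account": 4564079, "BalanceBand": "$1,000-2,000", "ExperianScoreBand": "600-650"},
    # Hypothetical account, not in the question, added to show a matching row.
    {"Account": 9000001, "BalanceBand": "$3,000-5,000", "ExperianScoreBand": "400-450"},
]

matches = filter_accounts(table1, table2)
print([row["Account"] for row in matches])
```

Building the set of selected pairs once and then testing each account against it avoids rescanning the whole second table for every account row.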

Jmeter Script executes but threads are not finishing

We are running JMeter performance test scripts on Linux servers. The test executes and produces results, but the threads do not appear to finish: the summary shows Finished: 0 from the first interval onward and only shows a non-zero value at the very end.
Why are the threads not finishing?
Sample test result
sh apache-jmeter-2.13/bin/jmeter.sh -n -t "PKS10-test-060221.jmx" -l jmeter_log.log -JL7.rampup=10 -JL7.duration=900 -JL7.thread_delay=0 -JL7.validations_per_issuance=100 -JL7.threads=75 | tee jmeter_console.log
Creating summariser
Created the tree successfully using PKS10-test-060221.jmx
Starting the test
Waiting for possible shutdown message on port 4445
summary + 1560 in 15.1s = 103.2/s Avg: 471 Min: 192 Max: 3710 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary + 4846 in 30s = 161.5/s Avg: 463 Min: 224 Max: 1619 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 6406 in 45.1s = 142.0/s Avg: 465 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4500 in 30s = 150.0/s Avg: 500 Min: 223 Max: 1954 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 10906 in 75.1s = 145.2/s Avg: 480 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4731 in 30s = 157.7/s Avg: 475 Min: 223 Max: 1824 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 15637 in 105s = 148.8/s Avg: 478 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4537 in 30s = 151.2/s Avg: 496 Min: 204 Max: 2109 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 20174 in 135s = 149.3/s Avg: 482 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4687 in 30s = 156.2/s Avg: 479 Min: 223 Max: 2064 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 24861 in 165s = 150.6/s Avg: 482 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4642 in 30s = 154.7/s Avg: 484 Min: 223 Max: 1754 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 29503 in 195s = 151.2/s Avg: 482 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4521 in 30s = 150.7/s Avg: 497 Min: 197 Max: 2167 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 34024 in 225s = 151.1/s Avg: 484 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4662 in 30s = 155.4/s Avg: 483 Min: 224 Max: 1712 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 38686 in 255s = 151.6/s Avg: 484 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4478 in 30s = 149.2/s Avg: 503 Min: 221 Max: 2655 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 43164 in 285s = 151.4/s Avg: 486 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4355 in 30s = 145.2/s Avg: 516 Min: 222 Max: 2079 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 47519 in 315s = 150.8/s Avg: 489 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4682 in 30s = 155.8/s Avg: 479 Min: 223 Max: 2092 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 52201 in 345s = 151.2/s Avg: 488 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4466 in 30.1s = 148.3/s Avg: 505 Min: 213 Max: 3105 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 56667 in 375s = 151.0/s Avg: 489 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4461 in 30s = 149.5/s Avg: 503 Min: 223 Max: 2508 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 61128 in 405s = 150.9/s Avg: 490 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 3926 in 30s = 130.9/s Avg: 573 Min: 223 Max: 2266 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 65054 in 435s = 149.5/s Avg: 495 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4635 in 30s = 154.4/s Avg: 486 Min: 223 Max: 1854 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 69689 in 465s = 149.8/s Avg: 494 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4287 in 30s = 142.9/s Avg: 524 Min: 222 Max: 2324 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 73976 in 495s = 149.4/s Avg: 496 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4997 in 30s = 166.6/s Avg: 447 Min: 195 Max: 2201 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 78973 in 525s = 150.4/s Avg: 493 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4985 in 30s = 166.2/s Avg: 452 Min: 222 Max: 2160 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 83958 in 555s = 151.2/s Avg: 491 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4795 in 30s = 159.8/s Avg: 470 Min: 201 Max: 2118 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 88753 in 585s = 151.7/s Avg: 489 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4899 in 30.1s = 162.9/s Avg: 458 Min: 221 Max: 2347 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 93652 in 615s = 152.2/s Avg: 488 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4715 in 30s = 157.6/s Avg: 477 Min: 222 Max: 2109 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 98367 in 645s = 152.5/s Avg: 487 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4602 in 30s = 153.4/s Avg: 488 Min: 222 Max: 2142 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 102969 in 675s = 152.5/s Avg: 487 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 4724 in 30s = 157.5/s Avg: 476 Min: 202 Max: 3099 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 107693 in 705s = 152.7/s Avg: 487 Min: 192 Max: 3710 Err: 0 (0.00%)
summary + 3963 in 30s = 132.1/s Avg: 567 Min: 193 Max: 3743 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 111656 in 735s = 151.9/s Avg: 490 Min: 192 Max: 3743 Err: 0 (0.00%)
summary + 4840 in 30s = 161.3/s Avg: 463 Min: 222 Max: 2726 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 116496 in 765s = 152.3/s Avg: 489 Min: 192 Max: 3743 Err: 0 (0.00%)
summary + 4493 in 30s = 149.7/s Avg: 501 Min: 222 Max: 2199 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0
summary = 120989 in 795s = 152.2/s Avg: 489
summary = 135136 in 885s = 152.7/s Avg: 488 Min: 192 Max: 3743 Err: 0 (0.00%)
summary + 2313 in 15.4s = 150.2/s Avg: 495 Min: 222 Max: 2085 Err: 0 (0.00%) Active: 0 Started: 75 Finished: 75
summary = 137449 in 901s = 152.6/s Avg: 488 Min: 192 Max: 3743 Err: 0 (0.00%)
You're looking at the output of JMeter's Summariser, which shows:
number of started threads
number of active threads
number of finished threads
number of executed samplers
throughput
average, minimum and maximum response times
percentage of errors.
It looks like you're running your test with 75 threads (virtual users). Once JMeter launches a thread, the thread starts executing Samplers from top to bottom (or as directed by Logic Controllers). When there are no more samplers to execute and no more loops to iterate, the thread is shut down.
So the Finished: 0 lines mean that all 75 threads are still active and executing Samplers, and none of them has finished its job yet.
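If you want to track the thread counts over a run instead of reading them by eye, a small script can pull them out of the console log. The following is only a sketch: the regular expression is written against the line format shown in the question, and the function name is hypothetical.

```python
import re

# Matches the incremental "summary +" lines, which carry the thread counters.
SUMMARY_RE = re.compile(
    r"summary \+\s+(?P<samples>\d+) in .*?"
    r"Active: (?P<active>\d+) Started: (?P<started>\d+) Finished: (?P<finished>\d+)"
)


def thread_counts(line):
    """Return the sample and thread counters of a 'summary +' line, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None  # cumulative "summary =" lines have no thread counters
    return {name: int(value) for name, value in match.groupdict().items()}


lines = [
    "summary + 1560 in 15.1s = 103.2/s Avg: 471 Min: 192 Max: 3710 Err: 0 (0.00%) Active: 75 Started: 75 Finished: 0",
    "summary + 2313 in 15.4s = 150.2/s Avg: 495 Min: 222 Max: 2085 Err: 0 (0.00%) Active: 0 Started: 75 Finished: 75",
    "summary = 137449 in 901s = 152.6/s Avg: 488 Min: 192 Max: 3743 Err: 0 (0.00%)",
]

for line in lines:
    print(thread_counts(line))
```

On the sample lines, the first interval reports 75 active and 0 finished threads, and the last interval reports 0 active and 75 finished, which matches the explanation above.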

Getting the linux time since power on

I've got
time = os.popen("tuptime").readline()
and the output should be something like:
System startups: 1 since 21:32:26 01.05.2020
System shutdowns: 0 ok - 0 bad
System uptime: 100.0% - 1 day, 16 hours, 49 minutes, 41 seconds
System downtime: 0.0% - 0 seconds
System life: 1 day, 16 hours, 49 minutes, 41 seconds
Largest uptime: 1 day, 16 hours, 49 minutes, 41 seconds from 21:32:26 01.05.2020
Shortest uptime: 1 day, 16 hours, 49 minutes, 41 seconds from 21:32:26 01.05.2020
Average uptime: 1 day, 16 hours, 49 minutes, 41 seconds
Largest downtime: 0 seconds
Shortest downtime: 0 seconds
Average downtime: 0 seconds
Current uptime: 1 day, 16 hours, 49 minutes, 41 seconds since 21:32:26 01.05.2020
How can I get and print out just the "System life" line?
You can run "tuptime | grep -i 'system life'" instead of plain "tuptime", so that readline() returns only that line.
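If you would rather do the filtering in Python than in the shell, a sketch along these lines should work. It assumes the tuptime command is installed and on PATH, and that its output uses the "label: value" layout shown in the question; the function names are made up for illustration.

```python
import subprocess


def system_life(tuptime_output):
    """Return the text after 'System life:' or None if that line is missing."""
    for line in tuptime_output.splitlines():
        # Split on the first colon only, because the values contain colons too.
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "system life":
            return value.strip()
    return None


def read_system_life():
    """Run tuptime (assumed to be on PATH) and extract the System life value."""
    result = subprocess.run(["tuptime"], capture_output=True, text=True, check=True)
    return system_life(result.stdout)


# Sample output copied from the question, so the parser can be tried without tuptime.
sample = """System uptime: 100.0% - 1 day, 16 hours, 49 minutes, 41 seconds
System downtime: 0.0% - 0 seconds
System life: 1 day, 16 hours, 49 minutes, 41 seconds
Largest uptime: 1 day, 16 hours, 49 minutes, 41 seconds from 21:32:26 01.05.2020
"""

print(system_life(sample))
```

Call `read_system_life()` on the machine itself; the sample string is only there so the parsing can be checked in isolation.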

Find Largest Value of Range With Criteria and Return Value of Another Cell

I need to figure out a formula to get the last messageid for each memberid within a threadid.
Below is an example. Basically, at the first instance of a memberid within a threadid, I need to return the value of messageid in "Last Post ID".
I have a database with tens of thousands of messageid and threadid values, so I can't do it manually like I did below. I'm not sure how to set up a formula or macro to perform this task. Any help is appreciated.
messageid threadid memberid # of Posts By ID Last Post ID
4332 3304 39 1 4332
4678 3304 231 1 4678
5383 3304 363 16 5383
5289 3304 363 15
5240 3304 363 14
5082 3304 363 13
4990 3304 363 12
4479 3304 363 11
4478 3304 363 10
4477 3304 363 9
4330 3304 363 8
4329 3304 363 7
3944 3304 363 6
3732 3304 363 5
3730 3304 363 4
3446 3304 363 3
3396 3304 363 2
3304 3304 363 1
4343 3304 436 4 4343
4185 3304 436 3
3816 3304 436 2
3696 3304 436 1
5010 3304 504 1 5010
5946 3304 522 1 5946
5409 3304 533 9 5409
5302 3304 533 8
5260 3304 533 7
5215 3304 533 6
4362 3304 533 5
3804 3304 533 4
3471 3304 533 3
3403 3304 533 2
3342 3304 533 1
3682 3304 821 1 3682
4151 3304 984 1 4151
3751 3304 1184 1 3751
5790 3304 1350 1 5790
5399 3304 1509 1 5399
7199 3304 2042 1 7199
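As a way of pinning down the rule before turning it into a formula or macro, here is a minimal Python sketch. It assumes that the largest messageid for a (threadid, memberid) pair is that member's last post, which is what the example rows suggest; the function name is hypothetical and only a handful of the rows above are included.

```python
def last_post_ids(rows):
    """Map each (threadid, memberid) pair to the largest messageid seen for it."""
    latest = {}
    for messageid, threadid, memberid in rows:
        key = (threadid, memberid)
        if key not in latest or messageid > latest[key]:
            latest[key] = messageid
    return latest


# (messageid, threadid, memberid) rows taken from the example table.
rows = [
    (4332, 3304, 39),
    (4678, 3304, 231),
    (5383, 3304, 363),
    (5289, 3304, 363),
    (3304, 3304, 363),
    (4343, 3304, 436),
    (3696, 3304, 436),
]

result = last_post_ids(rows)
for (threadid, memberid), messageid in sorted(result.items()):
    print(threadid, memberid, messageid)
```

The same rule in spreadsheet terms is "the maximum messageid among rows with this threadid and this memberid", displayed only on the first row of each group.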

OutOfMemory and Memory Fragmentation in SharePoint 2007 32 bit

For some weeks I have been struggling with an OutOfMemory issue on our SharePoint 2007 WFEs (published intranet with many customizations; SP 2 on Windows 2003 32-bit servers). After I received a crash memory dump, I found out that we have a memory fragmentation issue. For the dump analysis I used the following two tools: DebugDiag and WinDbg with sos.dll.
Result: 96,74% Free Memory Fragmentation
DebugDiag (Memory Pressure Analyzers)
Virtual Memory Summary
Size of largest free VM block 23,63 MBytes
Free memory fragmentation 96,74%
Free Memory 725,52 MBytes (35,43% of Total Memory)
Reserved Memory 406,88 MBytes (19,87% of Total Memory)
Committed Memory 915,54 MBytes (44,71% of Total Memory)
Total Memory 2,00 GBytes
Largest free block at 0x00000000`4b0b0000
DebugDiag (SharePoint Analyzers)
Undisposed SPRequest objects: 9
Disposed SPRequest objects: 187
Undisposed SPWeb objects: 185
Disposed SPWeb objects: 34
Undisposed SPSite objects: 8
Disposed SPSite objects: 22
undisposed special purpose (AllowCleanupWhenThreadEnds = false) SPRequest object found at: 0x02320dd0.
undisposed special purpose (AllowCleanupWhenThreadEnds = false) SPRequest object found at: 0x4e8296f8.
undisposed special purpose (AllowCleanupWhenThreadEnds = false) SPRequest object found at: 0x4e869a20.
undisposed SPWeb object 0x02701168 references a disposed or invalid SPRequest object: 0x0270137c
undisposed SPWeb object 0x027013cc references a disposed or invalid SPRequest object: 0x027015e0
undisposed SPWeb object 0x02720824 references a disposed or invalid SPRequest object: 0x02720a20
undisposed SPWeb object 0x02d2aa74 references a disposed or invalid SPRequest object: 0x02d2ac70
...
Undisposed SPRequest Objects per managed Thread:
Thread ID: 6714, Undisposed SPRequest: 4
Thread ID: 4c68, Undisposed SPRequest: 3
Thread ID: 5e8c, Undisposed SPRequest: 1
Thread ID: 6180, Undisposed SPRequest: 1
So now I would like to understand what causes the memory fragmentation. Hope you can help me. These are the steps I did to get the right information.
Windbg with sos.dll
!address -summary
-------------------- Usage SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Pct(Busy) Usage
2c748000 ( 728352) : 34.73% 53.79% : RegionUsageIsVAD
2d584000 ( 742928) : 35.43% 00.00% : RegionUsageFree
f987000 ( 255516) : 12.18% 18.87% : RegionUsageImage
10fc000 ( 17392) : 00.83% 01.28% : RegionUsageStack
44000 ( 272) : 00.01% 00.02% : RegionUsageTeb
1585a000 ( 352616) : 16.81% 26.04% : RegionUsageHeap
0 ( 0) : 00.00% 00.00% : RegionUsagePageHeap
1000 ( 4) : 00.00% 00.00% : RegionUsagePeb
1000 ( 4) : 00.00% 00.00% : RegionUsageProcessParametrs
1000 ( 4) : 00.00% 00.00% : RegionUsageEnvironmentBlock
Tot: 7fff0000 (2097088 KB) Busy: 52a6c000 (1354160 KB)
-------------------- Type SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
2d584000 ( 742928) : 35.43% : <free>
154e3000 ( 349068) : 16.65% : MEM_IMAGE
127c000 ( 18928) : 00.90% : MEM_MAPPED
3c30d000 ( 986164) : 47.03% : MEM_PRIVATE
-------------------- State SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
3938b000 ( 937516) : 44.71% : MEM_COMMIT
2d584000 ( 742928) : 35.43% : MEM_FREE
196e1000 ( 416644) : 19.87% : MEM_RESERVE
Largest free region: Base 4b0b0000 - Size 017a0000 (24192 KB)
It seems that there is enough free memory overall (742928 KB), but the largest free chunk is only 24192 KB. Again: free memory fragmentation!
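The two numbers above appear to be consistent with the 96,74% that DebugDiag reports, if one assumes the percentage is computed as 1 minus (largest free block / total free memory). That formula is my assumption about how the tool derives the figure, not something taken from its documentation; a quick check in Python:

```python
def free_fragmentation(largest_free_kb, total_free_kb):
    """Fraction of the free address space that lies outside the largest free block."""
    return 1.0 - largest_free_kb / total_free_kb


# Values from the !address output: largest free region and RegionUsageFree, both in KB.
frag = free_fragmentation(24192, 742928)
print(f"{frag:.2%}")
```

Read this way, the metric says that almost all of the free address space is scattered in pieces smaller than the single largest block, so an allocation that needs more contiguous space than that block can fail even though plenty of memory is free in total.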
!threads
ThreadCount: 38
UnstartedThread: 0
BackgroundThread: 37
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
14 1 5358 0010f718 1808220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
18 2 61a4 001118d0 b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Finalizer)
19 3 6060 0012a3f8 80a220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Completion Port)
20 4 64c8 0012df90 1220 Enabled 00000000:00000000 000dd1b8 0 Ukn
12 5 57f4 00147e80 880a220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Completion Port)
23 7 6714 0eb89f08 180b220 Enabled 00000000:00000000 0012e6d0 1 MTA (Threadpool Worker)
24 8 66b8 0eb91970 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
25 b 6320 0eb942f0 180b220 Disabled 00000000:00000000 0012e6d0 0 MTA (Threadpool Worker)
26 d 2004 0eb97120 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
27 e 5bb0 0eb9a438 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
28 f 61a8 0eb9dee8 380b220 Enabled 00000000:00000000 0012e6d0 1 MTA (Threadpool Worker)
29 14 3b88 0ebba688 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
30 15 5d74 0ebc4840 380b220 Enabled 00000000:00000000 0012e6d0 1 MTA (Threadpool Worker)
31 16 422c 0ebc91b0 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
32 18 6544 125242c8 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
33 1a 4c68 12534bc8 180b220 Disabled 4e875ac4:4e875d30 0012e6d0 1 MTA (Threadpool Worker)
34 1b 66d4 12539c80 180b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Worker)
35 1c 5e8c 12542e58 180b220 Enabled 00000000:00000000 0012e6d0 1 MTA (Threadpool Worker)
36 1d 62f0 1254be90 180b220 Enabled 4e875d84:4e877d30 0012e6d0 2 MTA (Threadpool Worker) System.OutOfMemoryException (4e875d3c)
39 1e 6558 0ec16d28 80a220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Completion Port)
40 1f 6180 0ec14b70 200b020 Enabled 00000000:00000000 0012e6d0 0 MTA
43 20 592c 0ebd7a00 220 Enabled 00000000:00000000 000dd1b8 0 MTA
45 21 624c 1261a060 220 Enabled 00000000:00000000 000dd1b8 0 MTA
8 22 5c78 125499f8 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
6 23 3c68 126b6e90 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
7 24 6458 36414400 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
44 25 5e60 36675440 220 Enabled 00000000:00000000 000dd1b8 0 MTA
5 26 55d8 364214a0 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
57 27 6534 36622948 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
58 28 59bc 0016f810 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
56 29 3ee0 250fa6d8 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
60 2a 63fc 252da068 200b220 Enabled 00000000:00000000 0012e6d0 0 MTA
59 2b 5fdc 24fc0be8 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
61 2c 4154 25052008 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
62 2d 60fc 250093a8 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
42 2e 1a38 1c99b5d0 220 Enabled 00000000:00000000 000dd1b8 0 MTA
63 13 59e0 0ebb5d48 220 Enabled 00000000:00000000 000dd1b8 0 Ukn
64 19 6420 0ebac6a0 880b220 Enabled 00000000:00000000 000dd1b8 0 MTA (Threadpool Completion Port)
!eeheap -gc
Number of GC Heaps: 2
------------------------------
Heap 0 (000e2418)
generation 0 starts at 0x432d0c54
generation 1 starts at 0x432c0038
generation 2 starts at 0x02060038
ephemeral segment allocation context: none
segment begin allocated size
02060000 02060038 03d0234c 0x01ca2314(30024468)
432c0000 432c0038 441e82f4 0x00f282bc(15893180)
Large object heap starts at 0x0a060038
segment begin allocated size
0a060000 0a060038 0bd57530 0x01cf74f8(30373112)
5b920000 5b920038 5c9bf338 0x0109f300(17429248)
Heap Size 0x5960dc8(93720008)
------------------------------
Heap 1 (00110750)
generation 0 starts at 0x4e869d30
generation 1 starts at 0x4e820038
generation 2 starts at 0x06060038
ephemeral segment allocation context: none
segment begin allocated size
06060000 06060038 07a0af98 0x019aaf60(26914656)
4e820000 4e820038 4e8eaf38 0x000caf00(831232)
Large object heap starts at 0x0c060038
segment begin allocated size
0c060000 0c060038 0c9ec998 0x0098c960(10013024)
6e020000 6e020038 6f90f0a8 0x018ef070(26144880)
Heap Size 0x3cf1830(63903792)
------------------------------
GC Heap Size 0x96525f8(157623800)
!dumpheap (heap1 extract)
0a060038 000e1a98 16 Free
0a060048 793042f4 4096
0a061048 000e1a98 16 Free
0a061058 793042f4 528
0a061268 000e1a98 16 Free
0a061278 793042f4 4096
0a062278 000e1a98 16 Free
0a062288 793042f4 5112
0a063680 000e1a98 16 Free
0a063690 793042f4 4096
0a064690 000e1a98 16 Free
0a0646a0 793042f4 4096
0a0656a0 000e1a98 16 Free
0a0656b0 793042f4 5112
0a066aa8 000e1a98 16 Free
0a066ab8 793042f4 4096
0a067ab8 000e1a98 16 Free
0a067ac8 793042f4 4096
0a068ac8 000e1a98 16 Free
0a068ad8 793042f4 4096
0a069ad8 793042f4 528
0a069ce8 000e1a98 16 Free
0a069cf8 793042f4 528
0a069f08 793042f4 528
0a06a118 000e1a98 16 Free
0a06a128 793042f4 528
0a06a338 000e1a98 260096 Free
0a0a9b38 793042f4 4096
0a0aab38 000e1a98 16 Free
0a0aab48 793042f4 5784
0a0ac1e0 000e1a98 16 Free
0a0ac1f0 793042f4 4096
0a0ad1f0 000e1a98 16 Free
0a0ad200 793042f4 528
0a0ad410 000e1a98 16 Free
0a0ad420 793042f4 4096
0a0ae420 000e1a98 16 Free
0a0ae430 793042f4 528
0a0ae640 000e1a98 16 Free
0a0ae650 793042f4 4096
0a0af650 000e1a98 16 Free
0a0af660 793042f4 528
0a0af870 000e1a98 16 Free
0a0af880 793042f4 528
0a0afa90 000e1a98 131120 Free
0a0cfac0 793042f4 528
0a0cfcd0 000e1a98 16 Free
0a0cfce0 793042f4 4096
0a0d0ce0 000e1a98 16 Free
0a0d0cf0 793042f4 528
0a0d0f00 000e1a98 16 Free
0a0d0f10 793042f4 528
0a0d1120 000e1a98 16 Free
0a0d1130 793042f4 528
0a0d1340 000e1a98 16 Free
0a0d1350 793042f4 4096
0a0d2350 000e1a98 16 Free
0a0d2360 793042f4 5784
0a0d39f8 000e1a98 16 Free
0a0d3a08 793042f4 4096
0a0d4a08 000e1a98 348200 Free
0a129a30 793042f4 528
0a129c40 000e1a98 16 Free
0a129c50 793042f4 528
0a129e60 000e1a98 361224 Free
0a182168 793042f4 528
0a182378 000e1a98 16 Free
0a182388 793042f4 7016
0a183ef0 000e1a98 16 Free
0a183f00 793042f4 7016
...
63859d80 14762 413336 System.Xml.XmlElement
6385a090 12103 435708 System.Xml.XmlName
79332b54 21020 504480 System.Collections.ArrayList
6385798c 32932 658640 System.Xml.NameTable+Entry
6385c76c 35215 704300 System.Xml.XmlAttribute
79331754 505 706416 System.Char[]
7932dd5c 12751 714056 System.Reflection.RuntimePropertyInfo
6385a284 36665 733300 System.Xml.XmlText
79332cc0 5530 791644 System.Int32[]
7932fde0 22824 1278144 System.Reflection.RuntimeMethodInfo
79333274 6758 1733808 System.Collections.Hashtable+bucket[]
793042f4 54360 5051132 System.Object[]
79333594 4772 29304312 System.Byte[]
79330b24 225539 33121896 System.String
000e1a98 239 72089072 Free
Total 711343 objects
Fragmented blocks larger than 0.5 MB:
Addr Size Followed by
4331b1d8 14.8MB 441e8270 System.Threading.Overlapped
I looked at some of the objects between the "Free" blocks, but unfortunately I can't find any information about the source that caused the issue.
!do 0a182388
Name: System.Object[]
MethodTable: 793042f4
EEClass: 790eda64
Size: 7012(0x1b64) bytes
Array: Rank 1, Number of elements 1749, Type CLASS
Element Type: System.Object
Fields:
None
!gcroot 0a182388
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 14 OSTHread 5358
Scan Thread 18 OSTHread 61a4
Scan Thread 19 OSTHread 6060
Scan Thread 20 OSTHread 64c8
Scan Thread 12 OSTHread 57f4
Scan Thread 23 OSTHread 6714
Scan Thread 24 OSTHread 66b8
Scan Thread 25 OSTHread 6320
Scan Thread 26 OSTHread 2004
Scan Thread 27 OSTHread 5bb0
Scan Thread 28 OSTHread 61a8
Scan Thread 29 OSTHread 3b88
Scan Thread 30 OSTHread 5d74
Scan Thread 31 OSTHread 422c
Scan Thread 32 OSTHread 6544
Scan Thread 33 OSTHread 4c68
Scan Thread 34 OSTHread 66d4
Scan Thread 35 OSTHread 5e8c
Scan Thread 36 OSTHread 62f0
Scan Thread 39 OSTHread 6558
Scan Thread 40 OSTHread 6180
Scan Thread 43 OSTHread 592c
Scan Thread 45 OSTHread 624c
Scan Thread 8 OSTHread 5c78
Scan Thread 6 OSTHread 3c68
Scan Thread 7 OSTHread 6458
Scan Thread 44 OSTHread 5e60
Scan Thread 5 OSTHread 55d8
Scan Thread 57 OSTHread 6534
Scan Thread 58 OSTHread 59bc
Scan Thread 56 OSTHread 3ee0
Scan Thread 60 OSTHread 63fc
Scan Thread 59 OSTHread 5fdc
Scan Thread 61 OSTHread 4154
Scan Thread 62 OSTHread 60fc
Scan Thread 42 OSTHread 1a38
Scan Thread 63 OSTHread 59e0
Scan Thread 64 OSTHread 6420
DOMAIN(0012E6D0):HANDLE(Pinned):e4613d4:Root:0a182388(System.Object[])
!gcroot 0a129a30
...
Scan Thread 64 OSTHread 6420
DOMAIN(000DD1B8):HANDLE(Pinned):1fc11b8:Root:0a129a30(System.Object[])
!gcroot 0a061278
...
DOMAIN(000DD1B8):HANDLE(Pinned):1fb13f0:Root:0a061278(System.Object[])
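All three objects checked above are rooted by pinned handles, and pinned objects cannot be moved when the GC compacts the heap. When many addresses need checking, the root lines can be tallied by handle kind and AppDomain instead of being read one by one. The sketch below assumes the `DOMAIN(...):HANDLE(...):...:Root:...` line format shown above; `handle_roots` is a hypothetical helper name, and the sample contains only the three root lines quoted here.

```python
import re
from collections import Counter

# One handle-root line as printed by !gcroot, e.g.
# DOMAIN(0012E6D0):HANDLE(Pinned):e4613d4:Root:0a182388(System.Object[])
ROOT = re.compile(
    r"DOMAIN\((?P<domain>[0-9A-Fa-f]+)\):HANDLE\((?P<kind>[^)]+)\):"
    r"(?P<handle>[0-9A-Fa-f]+):Root:(?P<obj>[0-9A-Fa-f]+)\((?P<type>.+)\)$"
)

def handle_roots(text):
    """Return one dict per handle-root line found in pasted !gcroot output."""
    roots = []
    for line in text.splitlines():
        match = ROOT.search(line.strip())
        if match:
            roots.append(match.groupdict())
    return roots

# The three root lines from the output above; "Scan Thread" lines are ignored.
sample = """\
Scan Thread 64 OSTHread 6420
DOMAIN(0012E6D0):HANDLE(Pinned):e4613d4:Root:0a182388(System.Object[])
DOMAIN(000DD1B8):HANDLE(Pinned):1fc11b8:Root:0a129a30(System.Object[])
DOMAIN(000DD1B8):HANDLE(Pinned):1fb13f0:Root:0a061278(System.Object[])
"""

roots = handle_roots(sample)
by_kind = Counter(r["kind"] for r in roots)
by_domain = Counter(r["domain"] for r in roots)

for kind, count in by_kind.items():
    print(f"{kind}: {count}")
for domain, count in by_domain.items():
    print(f"AppDomain {domain}: {count} root(s)")
```

Grouping by AppDomain may help narrow down which application the pinned arrays belong to, although the tally by itself does not show which component created them.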
!gchandles
GC Handle Statistics:
Strong Handles: 1007
Pinned Handles: 474
Async Pinned Handles: 6
Ref Count Handles: 5
Weak Long Handles: 681
Weak Short Handles: 56
Other Handles: 0
...
661485ec 68 2176 System.Web.NativeFileChangeNotification
66153774 93 2976 System.Web.Hosting.ISAPIAsyncCompletionCallback
793310f8 68 3808 System.Threading.Thread
793141f0 162 6480 System.Reflection.Emit.DynamicResolver
79332070 279 6696 System.Reflection.Assembly
7932f19c 228 10944 System.Reflection.Module
793327e8 328 11808 System.Security.PermissionSet
7932f25c 386 29336 System.RuntimeType+RuntimeTypeCache
793042f4 185 294456 System.Object[]
In the DebugDiag SharePoint Analysis some undisposed SPWeb objects were reported.
So trying to find the cause here...
Report: "undisposed SPWeb object 0x02701168 references a disposed or invalid SPRequest object: 0x0270137c"
!do 0x02701168
Name: Microsoft.SharePoint.SPWeb
MethodTable: 1325ed80
EEClass: 1669cd80
Size: 508(0x1fc) bytes
(C:\WINDOWS\assembly\GAC_MSIL\Microsoft.SharePoint\12.0.0.0__71e9bce111e9429c\Microsoft.SharePoint.dll)
!gcroot 0x02701168
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 14 OSTHread 5358
Scan Thread 18 OSTHread 61a4
Scan Thread 19 OSTHread 6060
Scan Thread 20 OSTHread 64c8
Scan Thread 12 OSTHread 57f4
Scan Thread 23 OSTHread 6714
ESP:efbe92c:Root:07a2cbc8(System.Collections.Hashtable+bucket[])->
023f1044(Microsoft.SharePoint.Publishing.CacheManager)->
023f3df4(Microsoft.SharePoint.Publishing.CachedObjectFactory)->
023f3e7c(Microsoft.SharePoint.Publishing.WssObjectCache)->
023f3f30(System.Collections.Hashtable)->
03b0f448(System.Collections.Hashtable+bucket[])->
0733b918(Microsoft.SharePoint.Publishing.ThreadSafeCache`2+CacheEntry`2[[System.String, mscorlib],[Microsoft.SharePoint.Publishing.CachedObjectWrapper, Microsoft.SharePoint.Publishing],[System.String, mscorlib],[Microsoft.SharePoint.Publishing.CachedObjectWrapper, Microsoft.SharePoint.Publishing]])->
0733b868(Microsoft.SharePoint.Publishing.CachedObjectWrapper)->
035f25b4(Microsoft.SharePoint.Publishing.CachedPage)->
035f2718(System.Collections.Generic.Dictionary`2[[Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapProvider, Microsoft.SharePoint.Publishing],[Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapNode, Microsoft.SharePoint.Publishing]])->
0733ba90(System.Collections.Generic.Dictionary`2+Entry[[Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapProvider, Microsoft.SharePoint.Publishing],[Microsoft.SharePoint.Publishing.Navigation.PortalSiteMapNode, Microsoft.SharePoint.Publishing]][])->
0733b92c(Microsoft.SharePoint.Publishing.Navigation.PortalListItemSiteMapNode)->
069b2f88(Microsoft.SharePoint.Publishing.Navigation.PortalWebSiteMapNode)->
069bfee4(System.Collections.Generic.Dictionary`2[[System.Guid, mscorlib],[Microsoft.SharePoint.Publishing.Navigation.ProxySiteMapNode, Microsoft.SharePoint.Publishing]])->
069d8fe8(System.Collections.Generic.Dictionary`2+Entry[[System.Guid, mscorlib],[Microsoft.SharePoint.Publishing.Navigation.ProxySiteMapNode, Microsoft.SharePoint.Publishing]][])->
069e0934(Microsoft.SharePoint.Publishing.Navigation.ProxySiteMapNode)->
069e04a8(Microsoft.SharePoint.Navigation.SPNavigationNode)->
069bfe28(Microsoft.SharePoint.Navigation.SPNavigation)->
069bf7dc(Microsoft.SharePoint.SPWeb)->
02700dcc(Microsoft.SharePoint.SPSite)->
02701364(System.Collections.Generic.List`1[[Microsoft.SharePoint.SPWeb, Microsoft.SharePoint]])->
The MS dispose checker didn't find any issues either.
So now I don't know how to proceed to find the (custom) component that causes the memory fragmentation. I hope someone can give me some hints, tool suggestions, or a checklist of components that may cause fragmentation (antivirus, caching, etc.). The problem occurs only in the prod environment, and the only thing we can do for now is an iisreset, sometimes 5 times a day…
Thank you in advance and best regards,
Anton
Your crash logs might contain the faulting object, but more likely the assembly that's executing changes with every crash and seems random. It might just be the innocent bystander that got left holding the bag when all the memory was gone.
First - can't you configure the app pool to automatically recycle when a certain memory threshold is reached? This might help alleviate your need to constantly monitor and be ready for an IISRESET. Otherwise you might want to schedule regular recycles to keep the memory tidy for the time being.
Next, try to identify when the crashes began and check your deployment logs to see what was installed. (You DO keep logs of software package installs, right?)
Are the custom components developed in-house? You can punt some of the work initially by having the developers run the SharePoint Dispose Checker Tool against all of their deployed projects (is this what you were referring to at the end of your question?). Undisposed SPWeb and SPSite objects seem to be the biggest cause of this kind of fragmentation.
Another avenue to explore is this MSDN question I ran into while looking for something else. It appears the Navigation bar on a publishing page was to blame. There is a hotfix for that issue but you have to request it directly from Microsoft.
I've been developing for SharePoint for a long time but it's always been someone else's job to find these problems! These tidbits are what I've gleaned over time and hopefully something will be useful.

Resources