How is the number of young regions determined by G1? - garbage-collection

Can anyone explain how the G1 collector determines the number of young regions? The G1 paper states that "we track the fixed and per-regions costs of fully young collections via historic averaging, and use these estimates to determine the number of young regions allocated between fully young evacuation pauses", but it gives no further detail.
For example, given the following GC log, what does "predicted young region time" mean and how is it calculated? How are the eden and survivor region counts calculated?
2015-12-30T08:04:21.124+0000: 10.424: [GC pause (young) 10.424: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 281, predicted base time: 17.05 ms, remaining time: 182.95 ms, target pause time: 200.00 ms]
10.424: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 155 regions, survivors: 5 regions, predicted young region time: 7335.19 ms]
10.424: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 155 regions, survivors: 5 regions, old: 0 regions, predicted pause time: 7352.24 ms, target pause time: 200.00 ms]
2015-12-30T08:04:21.171+0000: 10.472: [SoftReference, 0 refs, 0.0000490 secs]2015-12-30T08:04:21.172+0000: 10.472: [WeakReference, 617 refs, 0.0000780 secs]2015-12-30T08:04:21.172+0000: 10.472: [FinalReference, 360 refs, 0.0004730 secs]2015-12-30T08:04:21.172+0000: 10.472: [PhantomReference, 1 refs, 45 refs, 0.0000210 secs]2015-12-30T08:04:21.172+0000: 10.472: [JNI Weak Reference, 0.0000120 secs], 0.0494460 secs]
[Parallel Time: 47.1 ms, GC Workers: 13]
[GC Worker Start (ms): Min: 10424.3, Avg: 10424.4, Max: 10424.5, Diff: 0.2]
[Ext Root Scanning (ms): Min: 1.2, Avg: 1.6, Max: 2.0, Diff: 0.8, Sum: 21.2]
[Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Processed Buffers: Min: 0, Avg: 0.6, Max: 8, Diff: 8, Sum: 8]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.3, Diff: 0.2, Sum: 0.5]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.3, Diff: 0.3, Sum: 0.3]
[Object Copy (ms): Min: 44.4, Avg: 44.7, Max: 45.0, Diff: 0.6, Sum: 581.5]
[Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 3.0]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
[GC Worker Total (ms): Min: 46.5, Avg: 46.7, Max: 46.8, Diff: 0.2, Sum: 606.9]
[GC Worker End (ms): Min: 10471.1, Avg: 10471.1, Max: 10471.1, Diff: 0.1]
[Code Root Fixup: 0.1 ms]
[Code Root Migration: 0.3 ms]
[Clear CT: 0.5 ms]
[Other: 1.5 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 1.1 ms]
[Ref Enq: 0.0 ms]
[Free CSet: 0.3 ms]
[Eden: 4960.0M(4960.0M)->0.0B(4800.0M) Survivors: 160.0M->320.0M Heap: 5142.0M(100.0G)->345.4M(100.0G)]
[Times: user=0.60 sys=0.01, real=0.05 secs]
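The G1Ergonomics lines in the log above are internally consistent, and the arithmetic can be checked directly. The following is a minimal sketch in Python (not HotSpot code). All numbers are copied from the log; the 32 MB region size is inferred from 4960.0M / 155 regions, and the 5% floor assumes the experimental -XX:G1NewSizePercent flag is at its default value:

```python
# Values copied from the G1Ergonomics lines of the log above.
target_pause_ms = 200.00
predicted_base_ms = 17.05
predicted_young_region_ms = 7335.19
eden_regions, survivor_regions = 155, 5
eden_mb, heap_mb = 4960.0, 100 * 1024

# remaining time = target pause time - predicted base time
remaining_ms = round(target_pause_ms - predicted_base_ms, 2)

# predicted pause time = predicted base time + predicted young region time
predicted_pause_ms = round(predicted_base_ms + predicted_young_region_ms, 2)

# Region size inferred from the eden figures (an inference, not a logged value).
region_mb = eden_mb / eden_regions

# Assumption: G1NewSizePercent is at its default of 5, so the young
# generation cannot shrink below 5% of the heap's regions.
total_regions = heap_mb / region_mb
min_young_regions = total_regions * 5 / 100

print(remaining_ms)        # 182.95
print(predicted_pause_ms)  # 7352.24
print(region_mb)           # 32.0
print(min_young_regions)   # 160.0
```

The young list in the log (155 eden + 5 survivor = 160 regions) sits exactly at that assumed floor. That would be consistent with G1 predicting that it cannot meet the 200 ms target (7352.24 ms predicted) and falling back to the minimum young size. The actual pause took about 49 ms, so the predictions this early in the run look heavily overestimated.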

Related

I am getting ValueError: The estimator Sequential should be a classifier. How can I solve it?

I am trying to combine 31 pre-trained models with VotingClassifier. When I fit the ensemble, I get this error: ValueError: The estimator Sequential should be a classifier. The code is shown below:
estimators = [("EfficientNetB0_model",EfficientNetB0_model),("EfficientNetB1_model",EfficientNetB1_model),("DenseNet121_model",DenseNet121_model),
("DenseNet169_model",DenseNet169_model),("DenseNet201_model",DenseNet201_model),("EfficientNetB2_model",EfficientNetB2_model),
("EfficientNetB3_model",EfficientNetB3_model),("EfficientNetB4_model",EfficientNetB4_model),("EfficientNetB5_model",EfficientNetB5_model),
("EfficientNetB6_model",EfficientNetB6_model),("EfficientNetB7_model",EfficientNetB7_model),("EfficientNetV2B0_model",EfficientNetV2B0_model),
("EfficientNetV2B1_model",EfficientNetV2B1_model),("EfficientNetV2B2_model",EfficientNetV2B2_model),("EfficientNetV2B3_model",EfficientNetV2B3_model),
("EfficientNetV2L_model",EfficientNetV2L_model),("EfficientNetV2M_model",EfficientNetV2M_model),("EfficientNetV2S_model",EfficientNetV2S_model),
("InceptionResNetV2_model",InceptionResNetV2_model),("InceptionV3_model",InceptionV3_model),
("ResNet50_model",ResNet50_model),("ResNet50V2_model",ResNet50V2_model),("ResNet101_model",ResNet101_model),
("ResNet101V2_model",ResNet101V2_model),("ResNet152_model",ResNet152_model),("ResNet152V2_model",ResNet152V2_model),
("VGG16_model",VGG16_model),("VGG19_model",VGG19_model),("Xception_model",Xception_model),
("MobileNet_model",MobileNet_model),("MobileNetV2_model",MobileNetV2_model)]
weights = [0.2, 0.3, 0.0, 0.1, 0.0,
0.3, 0.2, 0.1, 0.0, 0.3,
0.1, 0.3, 0.3, 0.1, 0.0,
0.1, 0.2, 0.1, 0.1, 0.1,
0.4, 0.0, 0.2, 0.1, 0.4,
0.0, 0.0, 0.1, 0.1, 0.0, 0.0
]
ensemble = VotingClassifier(estimators, weights=weights, voting= 'soft')
ensemble._estimator_type = "classifier"
ensemble = ensemble.fit(X_train, y_train)
print(ensemble.predict(X_test))
Could you help me? I could not find any solution for this. Is there another way to do the voting? Thank you.
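VotingClassifier checks that each estimator is a scikit-learn classifier, which a Keras Sequential model is not; setting _estimator_type on the ensemble does not change that. One alternative is to do the soft voting by hand: collect each model's class probabilities and take a weighted average. This is a minimal sketch, assuming every model's predict returns an array of shape (n_samples, n_classes). The helper name weighted_soft_vote and the dummy probabilities are made up for illustration:

```python
import numpy as np

def weighted_soft_vote(probas, weights):
    """Return predicted class indices from a weighted average of probabilities.

    probas  : list of arrays, each (n_samples, n_classes), one per model
    weights : one non-negative weight per model (not all zero)
    """
    stacked = np.stack(probas)  # shape: (n_models, n_samples, n_classes)
    avg = np.average(stacked, axis=0, weights=weights)
    return avg.argmax(axis=1)

# Dummy probabilities standing in for [m.predict(X_test) for m in models].
p1 = np.array([[0.9, 0.1], [0.4, 0.6]])
p2 = np.array([[0.2, 0.8], [0.3, 0.7]])
labels = weighted_soft_vote([p1, p2], weights=[0.75, 0.25])
print(labels)  # [0 1]
```

With the real models, probas would be built from the 31 pre-trained networks and weights would be the list from the question; models with weight 0.0 could simply be left out.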

Causal Impact in Python giving error: exog contains inf or nans

I have the following dataset.
       y         X
0   70.0      10.0
1   59.0      10.0
2   40.0      10.0
3   56.0      10.0
4   46.0      10.0
5   65.0      10.0
6   60.0      10.0
7   45.0      10.0
8   55.0  555267.0
9   69.0  558056.0
10  64.0  176734.0
When I run the following code:
import pandas as pd
import numpy as np
from causalimpact import CausalImpact
y1 = [70.0, 59.0, 40.0, 56.0, 46.0, 65.0, 60.0, 45.0, 55.0, 69.0, 64.0]
X1 = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 5552675.0, 5580561.0, 1767342.0]
y = np.array(y1)
X = np.array(X1)
y[8:] += 5
data = pd.DataFrame({'y': y, 'X': X}, columns=['y', 'X'])
pre_period = [0, 7]
post_period = [8, 10]
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
I get the error: exog contains inf or nans
Any resolution to this would be great.
The problem is caused by having too many identical values in your X1 array. If you change any one of the 10.0 values to, say, 11.0, the problem disappears.
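A plausible mechanism, assuming the library standardizes each column with the mean and standard deviation of the pre-period (an assumption about its internals, not something confirmed here): X is constant at 10.0 for rows 0 to 7, so its standard deviation is zero and the division produces NaN and inf. A small NumPy illustration:

```python
import numpy as np

X = np.array([10.0] * 8 + [5552675.0, 5580561.0, 1767342.0])
pre, post = X[:8], X[8:]

mu, sigma = pre.mean(), pre.std()   # sigma is 0.0 for a constant column

# Suppress the divide-by-zero warnings so the result can be inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    z_pre = (pre - mu) / sigma      # 0 / 0 -> nan
    z_post = (post - mu) / sigma    # nonzero / 0 -> inf

print(sigma)                    # 0.0
print(np.isnan(z_pre).all())    # True
print(np.isinf(z_post).all())   # True
```

Changing a single pre-period value makes sigma nonzero, which matches the observation that the error then goes away.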

ValueError: could not convert string to float: 'ane'

print(xtest.head())
print("predicted as",myModel.predict(xtest))
Output:
age bp sg al su rbc pc ... rbcc htn dm cad appet pe ane
235 45.0 70.0 1.01 2.0 0.0 1.0 1.0 ... 4.8 0.0 0.0 1.0 1.0 0.0 1.0
[1 rows x 24 columns]
predicted as [[0.99633694]]
The xtest DataFrame has a column named ane and the model predicts well on it. But when I give the same kind of input as a dictionary:
di={'age': 59, 'bp': 70, 'sg': 1.01, 'al': 1.0, 'su': 3.0, 'rbc': 0.0, 'pc': 0.0, 'pcc': 0.0, 'ba': 0.0, 'bgr': 424.0, 'bu': 55.0, 'sc': 1.7, 'sod': 138.0, 'pot': 4.5, 'hemo': 12.0, 'pcv': 37.0, 'wbcc': 10200.0, 'rbcc': 4.1, 'htn': 1.0, 'dm': 1.0, 'cad': 1.0, 'appet': 1.0, 'pe': 0.0, 'ane': 1.0 }
b=pd.DataFrame(di.items())
b=b.T
x['ane'] = x['ane'].astype(float)
tensor = tf.convert_to_tensor(b, dtype=tf.float64)
print(myModel.predict((tensor)))
it shows the following error:
ValueError: could not convert string to float: 'ane'
When training the model, I did the same conversion and it worked well.
My Colab notebook:
https://colab.research.google.com/drive/1DomDo3adwRBQUFD0g8JVpF5jxC9HoegW
Try the code below; I have replaced the same code in the Colab notebook as well.
import pandas as pd
import tensorflow as tf
di={'age': 59, 'bp': 70, 'sg': 1.01, 'al': 1.0, 'su': 3.0, 'rbc': 0.0, 'pc': 0.0, 'pcc': 0.0, 'ba': 0.0, 'bgr': 424.0, 'bu': 55.0, 'sc': 1.7, 'sod': 138.0, 'pot': 4.5, 'hemo': 12.0, 'pcv': 37.0, 'wbcc': 10200.0, 'rbcc': 4.1, 'htn': 1.0, 'dm': 1.0, 'cad': 1.0, 'appet': 1.0, 'pe': 0.0, 'ane': 1.0 }
b=pd.DataFrame(list(di.items()),index=di)
b= b.drop(columns=0)
b=b.T
b['ane'] = b['ane'].astype(float)
tensor = tf.convert_to_tensor(b, dtype=tf.float32)
print(myModel.predict((tensor)))
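The error in the question can be reproduced without the model: di.items() yields (key, value) pairs, so the resulting frame has one column of keys and one of values, and after transposing, the first row holds the column names as strings. This is a short sketch with a shortened, illustrative dictionary; it also shows that wrapping the dict in a list gives a single numeric row directly:

```python
import pandas as pd

di = {'age': 59, 'bp': 70, 'ane': 1.0}   # shortened for illustration

# list() is used here only so the sketch does not depend on the pandas version.
bad = pd.DataFrame(list(di.items())).T
print(bad.shape)              # (2, 3): one row of keys, one row of values
print(bad.iloc[0].tolist())   # ['age', 'bp', 'ane']

good = pd.DataFrame([di]).astype(float)
print(good.shape)             # (1, 3)
print(good.columns.tolist())  # ['age', 'bp', 'ane']
```

Converting bad to a float tensor fails on the row of strings, whereas good contains only numbers.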

I want to convert a dictionary to pandas dataFrame

di={'ind': 1, 'age': 59, 'bp': 70, 'sg': 1.01, 'al': 1.0, 'su': 3.0, 'rbc': 0.0, 'pc': 0.0, 'pcc': 0.0, 'ba': 0.0, 'bgr': 424.0, 'bu': 55.0, 'sc': 1.7, 'sod': 138.0, 'pot': 4.5, 'hemo': 12.0, 'pcv': 37.0, 'wbcc': 10200.0, 'rbcc': 4.1, 'htn': 1.0, 'dm': 1.0, 'cad': 1.0, 'appet': 1.0, 'pe': 0.0, 'ane': 1.0}
I have this dictionary, and I want to convert it to a pandas DataFrame with 'ind': 1 as the index, 24 columns and 1 row.
These are the names of the columns that I want to have in my df:
d=['age', 'bp', 'sg','al', 'su', 'rbc', 'pc', 'pcc', 'ba', 'bgr', 'bu', 'sc', 'sod', 'pot', 'hemo', 'pcv', 'wbcc', 'rbcc', 'htn', 'dm','cad', 'appet', 'pe', 'ane']
Please guide me with it. I tried pd.DataFrame(di.items(), columns=d) but it returned a df with 1 column and 24 rows; I want the transpose of that, i.e. 24 columns and 1 row.
Thank you
di={'ind': 1, 'age': 59, 'bp': 70, 'sg': 1.01, 'al': 1.0, 'su': 3.0, 'rbc': 0.0, 'pc': 0.0, 'pcc': 0.0, 'ba': 0.0, 'bgr': 424.0, 'bu': 55.0, 'sc': 1.7, 'sod': 138.0, 'pot': 4.5, 'hemo': 12.0, 'pcv': 37.0, 'wbcc': 10200.0, 'rbcc': 4.1, 'htn': 1.0, 'dm': 1.0, 'cad': 1.0, 'appet': 1.0, 'pe': 0.0, 'ane': 1.0}
print( pd.DataFrame([di]).set_index('ind') )
Prints:
age bp sg al su rbc pc pcc ba bgr bu sc sod pot hemo pcv wbcc rbcc htn dm cad appet pe ane
ind
1 59 70 1.01 1.0 3.0 0.0 0.0 0.0 0.0 424.0 55.0 1.7 138.0 4.5 12.0 37.0 10200.0 4.1 1.0 1.0 1.0 1.0 0.0 1.0
You can try
df = pd.Series(di).to_frame(0).T.set_index('ind')

Summing specific columns based on a mapping

I have a series which contains a mapping:
serm = pd.Series(
data={'ARD1': 53, 'BUL1': 37,
'BUL2': 37, 'BSR1': 49, 'BTR1': 53, 'CR1': 53,
'CRR1': 53, 'CRE3': 53,'TAB1': 52, 'NEP1': 42, 'HAL1': 42})
which maps the asset id (the index) to an area (the value).
I have the following DataFrame, whose column names are the index of serm:
data=pd.DataFrame(data={'ARD1': {0: 4.0, 1: 2.0, 2: 2.0, 3: 3.0, 4: 2.0},
'BUL1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'BUL2': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'BSR1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'BTR1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'CR1': {0: 15.0, 1: 13.0, 2: 13.0, 3: 11.0, 4: 13.0},
'CRR1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'CRE3': {0: 8.0, 1: 10.0, 2: 9.0, 3: 10.0, 4: 11.0},
'TAB1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'NEP1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
'HAL1': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}})
I would like to sum the columns of data that fall in the same area, according to the mapping in serm. How can I achieve this (the more pandas-idiomatic the better)?
Use Index.map with a groupby over the columns and aggregate with sum (groupby(axis=1) is deprecated as of pandas 2.1; there, data.T.groupby(serm).sum().T gives the same result):
df = data.groupby(data.columns.map(serm.get), axis=1).sum()
print (df)
37 42 49 52 53
0 0.0 0.0 0.0 0.0 27.0
1 0.0 0.0 0.0 0.0 25.0
2 0.0 0.0 0.0 0.0 24.0
3 0.0 0.0 0.0 0.0 24.0
4 0.0 0.0 0.0 0.0 26.0
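Or assign the mapped areas as column labels and sum by label (sum(level=0, axis=1) was removed in pandas 2.0, so group on the transposed frame instead):
data.columns = data.columns.map(serm.get)
df = data.T.groupby(level=0).sum().T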
