Failed to get scheduler location from state manager Heron Tutorial - heron

I'm working through the Heron tutorial found here: https://apache.github.io/incubator-heron/docs/getting-started/
I didn't get very far before I encountered this error:
$: heron activate local WindowedWordCountTopology
[2019-02-01 15:55:11 +0000] [INFO]: Using cluster definition in /home/<my-user>/.heron/conf/local
[2019-02-01 15:55:11 +0000] [ERROR]: Failed to get scheduler location from state manager
[2019-02-01 15:55:11 +0000] [ERROR]: Failed to activate topology: WindowedWordCountTopology
I'm very new to Heron. Any idea what could be causing this?

This is not really an answer, but it is too long for a comment, so I have to post it as one.
It is hard to tell what the issue is so far. For reference, this is the expected output with nearly the latest code:
$ ~/.heron/bin/heron submit local ~/.heron/examples/heron-streamlet-examples.jar org.apache.heron.examples.streamlet.WindowedWordCountTopology WindowedWordCountTopology --deploy-deactivated
[2019-02-05 15:09:53 -0800] [INFO]: Using cluster definition in /Users/<user>/.heron/conf/local
Feb 05, 2019 3:09:54 PM org.apache.heron.streamlet.impl.StreamletImpl defaultNameCalculator
INFO: Calculated stage Name as consumer1
[2019-02-05 15:09:54 -0800] [INFO]: Launching topology: 'WindowedWordCountTopology'
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Initalizing RoundRobinPacking. CPU default: 1.000000, RAM default: ByteAmount{1.0 GB (1073741824 bytes)}, DISK default: ByteAmount{1.0 GB (1073741824 bytes)}, RAM padding: ByteAmount{2.0 GB (2147483648 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal: container CPU hint: 3.000, RAM hint: ByteAmount{-1 bytes}, disk hint: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#1 CPU: 3.000000, RAM: ByteAmount{4.0 GB (4294967296 bytes)}, disk: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#2 CPU: 3.000000, RAM: ByteAmount{4.0 GB (4294967296 bytes)}, disk: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Initalizing RoundRobinPacking. CPU default: 1.000000, RAM default: ByteAmount{1.0 GB (1073741824 bytes)}, DISK default: ByteAmount{1.0 GB (1073741824 bytes)}, RAM padding: ByteAmount{2.0 GB (2147483648 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal: container CPU hint: 3.000, RAM hint: ByteAmount{-1 bytes}, disk hint: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#1 CPU: 3.000000, RAM: ByteAmount{4.0 GB (4294967296 bytes)}, disk: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO] org.apache.heron.packing.roundrobin.RoundRobinPacking: Pack internal finalized: container#2 CPU: 3.000000, RAM: ByteAmount{4.0 GB (4294967296 bytes)}, disk: ByteAmount{14.0 GB (15032385536 bytes)}.
[2019-02-05 15:09:55 -0800] [INFO]: Successfully launched topology 'WindowedWordCountTopology'
$ ~/.heron/bin/heron activate local WindowedWordCountTopology
[2019-02-05 15:10:08 -0800] [INFO]: Using cluster definition in /Users/<user>/.heron/conf/local
[2019-02-05 15:10:09 -0800] [INFO] org.apache.heron.spi.utils.TMasterUtils: Topology command ACTIVATE completed successfully.
[2019-02-05 15:10:09 -0800] [INFO]: Successfully activate topology: WindowedWordCountTopology
For "local", the topology state data is stored under ~/.herondata/repository/state/local//. In your case, check whether this file was created and contains correct information: ~/.herondata/repository/state/local/schedulers/WindowedWordCountTopology
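The check suggested above can be scripted. The following is a minimal Python sketch, not part of Heron: the function names are hypothetical, the path layout is taken from this answer, and treating a missing or empty file as the likely cause of the error is an assumption.

```python
from pathlib import Path


def scheduler_state_file(topology: str, state_root: Path) -> Path:
    """Path where the scheduler location for `topology` is expected,
    following the layout described in the answer (an assumption)."""
    return state_root / "schedulers" / topology


def describe_state_file(path: Path) -> str:
    """Report whether the scheduler location file is missing, empty or present."""
    if not path.exists():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    return "present"


if __name__ == "__main__":
    # State root for the "local" cluster, as given in the answer.
    root = Path.home() / ".herondata" / "repository" / "state" / "local"
    path = scheduler_state_file("WindowedWordCountTopology", root)
    print(f"{path}: {describe_state_file(path)}")
```

If the file is reported as missing, the topology was probably never submitted successfully, so running the submit step again before activate would be the first thing to try.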

Related

Disk space issue on docker container

We have deployed Jenkins in a Docker container, and recently our Jenkins server stopped coming up because of a disk space issue. Below is the error we see in the logs.
2022-09-17 21:41:32.567+0000 [id=32] INFO hudson.slaves.SlaveComputer#tryReconnect: Attempting to reconnect V3LOCITY-SLAVE-02
/usr/local/bin/jenkins.sh: line 38: cannot create temp file for here-document: No space left on device
Running from: /usr/share/jenkins/jenkins.war
webroot: EnvVars.masterEnvVars.get("JENKINS_HOME")
Exception in thread "main" java.io.IOException: Jenkins has failed to create a temporary file in /tmp
at Main.extractFromJar(Main.java:498)
at Main._main(Main.java:310)
at Main.main(Main.java:151)
Caused by: java.io.IOException: No space left on device
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at Main.extractFromJar(Main.java:495)
... 2 more
We assume the Docker container is running out of space. See the info below for reference.
TYPE            TOTAL   ACTIVE   SIZE      RECLAIMABLE
Images          1       1        572.5MB   0B (0%)
Containers      1       0        9.467GB   9.467GB (100%)
Local Volumes   0       0        0B        0B
Build Cache     0       0        0B        0B
Assuming the container was running out of space, we increased the base size to 40 GB by adding the content below to /etc/docker/daemon.json and recreated the container, but we still see the same issue after restarting the container.
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.basesize=40G"
  ]
}
See the docker info output below for reference.
Client:
Debug Mode: false
Server:
Containers: 1
Running: 0
Paused: 0
Stopped: 1
Images: 1
Server Version: 19.03.11-ol
Storage Driver: devicemapper
Pool Name: docker-249:0-1140851221-pool
Pool Blocksize: 65.54kB
Base Device Size: 42.95GB
Backing Filesystem: xfs
Udev Sync Supported: true
Data file: /dev/loop0
Metadata file: /dev/loop1
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Data Space Used: 10.82GB
Data Space Total: 107.4GB
Data Space Available: 96.56GB
Metadata Space Used: 6.877MB
Metadata Space Total: 2.147GB
Metadata Space Available: 2.141GB
Thin Pool Minimum Free Space: 10.74GB
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.170-RHEL7 (2020-03-24)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7eba5930496d9bbe375fdf71603e610ad737d2b2
runc version: 52de29d
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.1.12-124.65.1.2.el7uek.x86_64
Operating System: Oracle Linux Server 7.9
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.56GiB
Name: vm-app-docker-jenkinsqa
ID: TAII:OWLM:Y3BU:65DC:A3SK:SSJQ:H6H2:BLA2:HQA5:ODCP:Y7S5:KCJ2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.
WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Registries:
You need to map the Jenkins home directory to an external folder (volume) and make sure the host has enough space.
See the Jenkins docs for more details.
For example:
docker run --name jenkins -v /var/jenkins_home:/var/jenkins_home ...
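Before relying on the mapped folder, it may help to confirm that the host filesystem behind it has room. Here is a minimal Python sketch using the standard library; the helper names and the 20 GiB threshold are hypothetical, and /var/jenkins_home is simply the host path from the example above.

```python
import shutil


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room(path: str, required_gib: float) -> bool:
    """True if at least `required_gib` GiB are free under `path`."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    host_dir = "/var/jenkins_home"  # host side of the -v mapping
    required = 20.0  # hypothetical threshold, adjust to your setup
    try:
        status = "enough" if has_room(host_dir, required) else "not enough"
        print(f"{host_dir}: {free_gib(host_dir):.1f} GiB free, {status} space")
    except OSError as exc:
        # Typically the directory has not been created on the host yet.
        print(f"cannot inspect {host_dir}: {exc}")
```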

Error "unknown queue: root.default" when spark-submitting to YARN

I am submitting a simple PySpark wordcount job to a freshly built YARN cluster, via Airflow and the SparkSubmitOperator. The job reaches YARN and I can see it in the ResourceManager UI, but it fails with this error:
"Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default"
User: root
Name: PySpark Wordcount
Application Type: SPARK
Application Tags:
YarnApplicationState: FAILED
Queue: root.default
FinalStatus Reported by AM: FAILED
Started: Fri Feb 21 08:01:25 +1100 2020
Elapsed: 0sec
Tracking URL: History
Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default
The root.default queue certainly seems to be there:
Application Queues
Legend:CapacityUsedUsed (over capacity)Max Capacity
.root 0.0% used
..Queue: default 0.0% used
'default' Queue Status
Queue State: RUNNING
Used Capacity: 0.0%
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Absolute Used Capacity: 0.0%
Absolute Configured Capacity: 100.0%
Absolute Configured Max Capacity: 100.0%
Used Resources: <memory:0, vCores:0>
Num Schedulable Applications: 0
Num Non-Schedulable Applications: 0
Num Containers: 0
Max Applications: 10000
Max Applications Per User: 10000
Max Application Master Resources: <memory:3072, vCores:1>
Used Application Master Resources: <memory:0, vCores:0>
Max Application Master Resources Per User: <memory:3072, vCores:1>
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Accessible Node Labels: *
Preemption: disabled
What am I missing here? Thanks.
Submit with the queue name default.
The root in the ResourceManager is used only to group the queues hierarchically.
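As an illustration, the sketch below (Python) reduces a dotted queue path to its last component and builds a spark-submit command line with it. The helper names are hypothetical, and taking the last path component as the name to submit with is an assumption based on this answer. `--master` and `--queue` are standard spark-submit options; the same value can also be supplied through the `spark.yarn.queue` property. Where exactly to set it in Airflow depends on your connection and operator setup.

```python
from typing import List


def leaf_queue(queue: str) -> str:
    """Return the last component of a dotted queue path,
    e.g. 'root.default' becomes 'default'."""
    return queue.rsplit(".", 1)[-1]


def spark_submit_args(app: str, queue: str) -> List[str]:
    """Build a spark-submit command line that targets YARN and
    passes only the leaf queue name."""
    return ["spark-submit", "--master", "yarn", "--queue", leaf_queue(queue), app]


if __name__ == "__main__":
    # wordcount.py is a placeholder application file name.
    print(" ".join(spark_submit_args("wordcount.py", "root.default")))
```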

TCP/UDP packets not reaching docker container

My host machine OS is OEL7 with kernel
Linux ispaaaems1 3.10.0-123.el7.x86_64 #1 SMP Wed Jul 9 18:59:11 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
And my docker info is
Containers: 4
Images: 124
Storage Driver: devicemapper
Pool Name: docker-253:0-88356-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 7.43 GB
Data Space Total: 107.4 GB
Data Space Available: 99.94 GB
Metadata Space Used: 9.302 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.138 GB
Udev Sync Supported: true
Library Version: 1.02.107-RHEL7 (2015-12-01)
Execution Driver: native-0.2
Kernel Version: 3.10.0-123.el7.x86_64
Operating System: Oracle Linux Server 7.2
CPUs: 2
Total Memory: 7.641 GiB
Name: ispaaaems1
ID: 6MUK:HS3D:OQTS:QMWY:WCKE:AZT6:COJP:F7EA:RPNX:7RHY:TKFB:D4LT
I am running a Docker container with OEL 6.6 as its OS. I am sending RADIUS requests on ports 1812-1813. All the packets reach the host machine, but some of them (3 out of 5) are dropped and never reach the container.
Any help will be appreciated. Thanks in advance.

Error when building a Docker container

The docker build command fails with this error:
Error getting container f43128eda488c88a3b2e111aafb30b80a44faaead33bcf02f8bffd7ae1832753 from driver devicemapper: Error mounting '/dev/mapper/docker-8:2-41159178-f43128eda488c88a3b2e111aafb30b80a44faaead33bcf02f8bffd7ae1832753' on '/var/lib/docker/devicemapper/mnt/f43128eda488c88a3b2e111aafb30b80a44faaead33bcf02f8bffd7ae1832753': no such file or directory
docker info
Containers: 7
Images: 148
Storage Driver: devicemapper
Pool Name: docker-8:2-41159178-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 27.96 GB
Data Space Total: 107.4 GB
Data Space Available: 79.42 GB
Metadata Space Used: 19.27 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.128 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.77 (2012-10-15)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.14.27-xxxx-grs-ipv6-64
Operating System: Ubuntu 14.04.2 LTS
CPUs: 4
Total Memory: 15.66 GiB
Name: libra
ID: KYU6:KECQ:GGF3:QL4W:SB35:C3UX:54EY:NN3A:U4RF:SFIK:5ULD:2THZ
Username: porfus
Registry: https://index.docker.io/v1/
I use an OVH Kimsufi root server with Ubuntu 14.04.2 (3.14.27-xxxx-grs-ipv6-64 #1 SMP Wed Dec 17 14:02:42 CET 2014 x86_64 x86_64 x86_64 GNU/Linux). I read the description of this bug on GitHub (https://github.com/docker/docker/issues/4036), but did not understand how to get rid of it.
Given that you have "Library Version: 1.02.77 (2012-10-15)", consider upgrading Docker and that library to their latest versions.
That would ensure that all the fixes mentioned in issues/4036 are included.

Docker run, no space left on device

[root@host ~]# docker run 9e7de9390856
Timestamp: 2015-06-15 22:20:58.8367035 +1000 AEST
Code: System error
Message: [/usr/bin/tar -xf /var/lib/docker/tmp/cde0f3a199597ac2e18e7efc7744c84a6c134adef31fb88b6982a8732f45efa5090033894/_tmp.tar -C /var/lib/docker/devicemapper/mnt/cde0f3a199597ac2e18e7efc7744c84a6c134adef31fb88b6982a8732f45efa5/rootfs/tmp .] failed: /usr/bin/tar: ./was/fixPack/7.0.0-WS-WASSDK-LinuxX64-FP0000027.pak: Wrote only 4608 of 10240 bytes
/usr/bin/tar: ./was/fixPack/wasFixPackInstallResponseFile: Cannot write: No space left on device
.
.
Cannot write: No spaFATA[0141] Error response from daemon: : exit status 2
df -h:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda2 6.0G 3.2G 2.9G 52% /
devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.8G 0 1.8G 0% /dev/shm
tmpfs 1.8G 17M 1.8G 1% /run
tmpfs 1.8G 0 1.8G 0% /sys/fs/cgroup
/dev/xvdb1 99G 28G 67G 30% /var/lib/docker
docker info:
Containers: 2
Images: 34
Storage Driver: devicemapper
Pool Name: docker-202:17-2621441-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: extfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 15.89 GB
Data Space Total: 107.4 GB
Data Space Available: 76.3 GB
Metadata Space Used: 10.27 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.137 GB
Udev Sync Supported: true
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.93-RHEL7 (2015-01-28)
Execution Driver: native-0.2
Kernel Version: 3.10.0-229.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.1 (Maipo)
CPUs: 2
Total Memory: 3.452 GiB
Name: ip-10-100-128-182.localdomain
ID: 4ZZZ:BSQD:GBKL:4Y3N:J6BL:47QE:3HMQ:GLMY:FPUK:CEPM:3EBP:ZU7G
Debug mode (server): true
Debug mode (client): false
Fds: 13
Goroutines: 18
System Time: Mon Jun 15 22:48:24 AEST 2015
EventsListeners: 0
Init SHA1: 836be3a369bfc6bd4cbd3ade1eedbafcc1ea05d0
Init Path: /usr/libexec/docker/dockerinit
Docker Root Dir: /var/lib/docker
uname -a:
Linux ip-10-100-128-182.localdomain 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
Can anyone help me?
I'm not sure this information is enough, but I have tried a couple of solutions and nothing worked.
docker version:
Client version: 1.6.0
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 8aae715/1.6.0
OS/Arch (client): linux/amd64
Server version: 1.6.0
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 8aae715/1.6.0
OS/Arch (server): linux/amd64
[root@host ~]# service docker status -l
Redirecting to /bin/systemctl status -l docker.service
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled)
Active: active (running) since Tue 2015-06-16 00:31:46 AEST; 2min 2s ago
Docs: http://docs.docker.com
Main PID: 3306 (docker)
CGroup: /system.slice/docker.service
└─3306 /usr/bin/docker -d --storage-opt dm.basesize=30G --storage-opt dm.loopmetadatasize=4G
It sounds like you're trying to start a container from a 14GB image.
A Docker container, when using the devicemapper storage driver, only has 10GB of space available by default. You appear to be using the devicemapper driver, so this is probably the source of your problem.
This article discusses in detail the process you need to use to increase the amount of space available for container filesystems.
Filesystem-based drivers (like the overlay driver) do not have this same limitation (but they may of course suffer from other limitations).
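To see what the devicemapper driver reports, the `docker info` text can be parsed into key/value pairs. This is a rough Python sketch with hypothetical function names. It assumes the "Key: value" line format and decimal units (kB, MB, GB) seen in the output quoted in the question, and it uses a short excerpt of that output as sample input.

```python
import re

# Decimal multipliers, assumed from the units shown in the quoted output.
_UNITS = {"B": 1, "kB": 1000, "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}


def parse_docker_info(text: str) -> dict:
    """Turn 'Key: value' lines into a dict, skipping lines with no value."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def size_to_bytes(text: str) -> float:
    """Convert strings such as '15.89 GB' or '42.95GB' to a byte count."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)\s*([kKMGT]?B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


# Excerpt of the docker info output from the question.
SAMPLE = """\
Storage Driver: devicemapper
 Pool Name: docker-202:17-2621441-pool
 Data Space Used: 15.89 GB
 Data Space Total: 107.4 GB
 Data Space Available: 76.3 GB
"""

if __name__ == "__main__":
    info = parse_docker_info(SAMPLE)
    available_gb = size_to_bytes(info["Data Space Available"]) / 1000**3
    print(f"driver={info['Storage Driver']}, pool space available={available_gb:.1f} GB")
```

Note that the pool-level "Data Space Available" figure is separate from the per-container base size this answer refers to, so a pool with plenty of free space can still yield containers that fill up.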
