Can't find documents with MediaWiki 1.21 with the Lucene-search extension

We're running MediaWiki 1.21 on Ubuntu 12.04.3 with the Lucene-search extension 2.1.3 (from its build.properties file).
I followed the instructions for a Single Host Setup (using ant to build the jar), and Setting Up Suggestions for the Search Box. Things seemed to be working just fine. However, new documents aren't being matched by the type-ahead search feature. Looking at the filesystem, I see that there are various items in the application's indexes directory:
$ cd /usr/local/search/lucene-search-2/indexes
$ ls -l
total 24
drwxr-xr-x 10 root root 4096 Aug 20 2013 import
drwxr-xr-x 7 root root 4096 Apr 14 06:42 index
drwxr-xr-x 2 root root 4096 Apr 14 06:41 search
drwxr-xr-x 9 root root 4096 Aug 20 2013 snapshot
drwxr-xr-x 2 root root 4096 Aug 20 2013 status
drwxr-xr-x 8 root root 4096 Aug 20 2013 update
We have a daily cron job that runs the Lucene-search build command, which dumps the wiki database as XML and then modifies files in the import and snapshot folders. I noticed that the job reads from the search folder, which contains symbolic links to the update folder:
$ ls -l search/
total 24
lrwxrwxrwx 1 root root 70 Feb 12 21:39 wikidb -> /usr/local/search/lucene-search-2/indexes/update/wikidb/20140212064727
lrwxrwxrwx 1 root root 73 Feb 12 21:39 wikidb.hl -> /usr/local/search/lucene-search-2/indexes/update/wikidb.hl/20140212064727
lrwxrwxrwx 1 root root 76 Apr 14 06:41 wikidb.links -> /usr/local/search/lucene-search-2/indexes/update/wikidb.links/20140414064150
lrwxrwxrwx 1 root root 77 Feb 12 21:39 wikidb.prefix -> /usr/local/search/lucene-search-2/indexes/update/wikidb.prefix/20140212064728
lrwxrwxrwx 1 root root 78 Feb 12 21:39 wikidb.related -> /usr/local/search/lucene-search-2/indexes/update/wikidb.related/20140212064713
lrwxrwxrwx 1 root root 76 Feb 12 21:39 wikidb.spell -> /usr/local/search/lucene-search-2/indexes/update/wikidb.spell/20140212064740
Only the wikidb.links entry is current. The others are a couple of months old, which makes me think I missed something in how our daily cron task is set up. Here's the job:
#!/bin/sh
log=/var/log/lucene-search-2-cron.log
(
echo "Building wiki lucene-search indexes ..."
cd /usr/local/search/lucene-search-2
./build
echo "Stopping the lsearchd service..."
service lsearchd stop
# ok, so stopping the service apparently doesn't mean that the processes are gone, whack them manually
# See tip on using the "[x]yz" character class option so you don't need the additional "grep -v xyz":
# http://stackoverflow.com/questions/3510673/find-and-kill-a-process-in-one-line-using-bash-and-regex
echo "Killing any lucene-search processes that didn't terminate..."
kill -9 $(ps -ef | grep '[l]search' | awk '{print $2}')
echo "Starting the lsearchd service..."
service lsearchd start
) > $log 2>&1
And here's the service script /etc/init.d/lsearchd:
#!/bin/sh -e
### BEGIN INIT INFO
# Provides: lsearchd
# Required-Start: $syslog
# Required-Stop: $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 1
# Short-Description: Start the Lucene Search daemon
# Description: Provide a Lucene Search backend for MediaWiki. Copied by John Ericson from: http://ubuntuforums.org/showthread.php?t=1476445
### END INIT INFO
# Set to the install directory of lucene-search. For example: /usr/local/lucene-search-2.1.3
LUCENE_SEARCH_DIR="/usr/local/search/lucene-search-2"
# Set username for daemon to run as. Can also use syntax "username:groupname" to also specify group for daemon to run as. For example: me:me
RUN_AS_USER="lsearchd"
OPTIONS="-configfile $LUCENE_SEARCH_DIR/lsearch.conf"
test -x $LUCENE_SEARCH_DIR/lsearchd || exit 0
test -n "$RUN_AS_USER" && CHUID_ARG="--chuid $RUN_AS_USER" || CHUID_ARG=""
if [ -f "/etc/default/lsearchd" ] ; then
. /etc/default/lsearchd
fi
. /lib/lsb/init-functions
case "$1" in
start)
cd $LUCENE_SEARCH_DIR
log_begin_msg "Starting Lucene Search Daemon..."
start-stop-daemon --start --quiet --oknodo --chdir $LUCENE_SEARCH_DIR --background $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd -- $OPTIONS
log_end_msg $?
;;
stop)
log_begin_msg "Stopping Lucene Search Daemon..."
start-stop-daemon --stop --quiet --oknodo --retry 2 --chdir $LUCENE_SEARCH_DIR $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd
log_end_msg $?
;;
restart)
$0 stop
sleep 1
$0 start
;;
reload|force-reload)
log_begin_msg "Reloading Lucene Search Daemon..."
start-stop-daemon --stop --signal 1 --chdir $LUCENE_SEARCH_DIR $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd
log_end_msg $?
;;
status)
status_of_proc $LUCENE_SEARCH_DIR/lsearchd lsearchd && exit 0 || exit $?
;;
*)
log_success_msg "Usage: /etc/init.d/lsearchd {start|stop|restart|reload|force-reload|status}"
exit 1
esac
exit 0
Update #1:
I deleted the update directory and ran the build command manually from the console as root. As expected, it only generated the update/wikidb.links entry; none of the other folders were created. I reviewed my earlier setup notes and didn't see anything different, so how did those folders get created, and how do they get maintained?
Update #2:
I retraced my steps from the initial install and couldn't see anything I had missed. So, on a hunch, I stopped the service and ran lsearchd from the console, and it created the missing directories! I then terminated that process and tried things again: I deleted the indexes folder and ran the cron script from the console as root. I confirmed that when run this way, lsearchd DID NOT create the missing directories. And of course, now I remember that I had run lsearchd from the console when initially setting things up, to verify that it was receiving client queries from the wiki's Search input field. Those are the indexes it has been using for lookups ever since, which explains why new documents are not included.
Here is what the command looks like when run as a service:
$ ps -ef | grep [l]search
lsearchd 10192 1 0 14:02 ? 00:00:00 /bin/bash /usr/local/search/lucene-search-2/lsearchd -configfile /usr/local/search/lucene-search-2/lsearch.conf
lsearchd 10198 10192 0 14:02 ? 00:00:01 java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2/LuceneSearch.jar -Djava.rmi.server.hostname=AMWikiBugz -jar /usr/local/search/lucene-search-2/LuceneSearch.jar -configfile /usr/local/search/lucene-search-2/lsearch.conf
So the remaining question is:
Why does lsearchd NOT create the directories when run as a service?

This was a permissions issue. d'oh!
The cron job and service init scripts all execute as root; however, the service process itself runs as the lsearchd user. Once I changed ownership of /usr/local/search/lucene-search-2/indexes/ and all of its subdirectories to lsearchd:lsearchd, the lsearchd process was able to create the missing directories when run via the service under cron.
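For reference, the fix amounted to something like this (a sketch using the paths and the lsearchd account shown above):
sudo chown -R lsearchd:lsearchd /usr/local/search/lucene-search-2/indexes
# quick check that the daemon account can now write there
sudo -u lsearchd touch /usr/local/search/lucene-search-2/indexes/.write-test
sudo -u lsearchd rm /usr/local/search/lucene-search-2/indexes/.write-test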
It would have helped if something along the way had logged an error message to syslog indicating that it couldn't write to the target folder.

Related

How to access the scratch folder of a SLURM cluster node

I would appreciate your suggestions and advice on the following, please:
I am using a SLURM cluster, and my colleagues have advised me to run a Singularity container on the cluster and redirect the output of the container to a folder hosted in the /scratch folder of each compute node.
For example:
singularity exec --bind /local/scratch/bt:/output \
singularity_latest.sif run \
-o /output
I would like to ask, please: how can I access the "output" folder in the "scratch" of the compute node? Thanks a lot!
bogdan
You can think of --bind as a bit like a symlink. Running ls /local/scratch/bt on the host OS is equivalent to running ls /output inside the exec process.
mkdir scratch
touch scratch/file1
ls -l scratch
# total 0
# -rw-rw-r-- 1 tsnowlan tsnowlan 0 Jun 8 09:13 file1
singularity exec -B $PWD/scratch:/output my_image.sif ls -l /output
# total 0
# -rw-rw-r-- 1 tsnowlan tsnowlan 0 Jun 8 09:13 file1
# singularity also accepts relative paths
singularity exec -B scratch:/output my_image.sif touch /output/file2
ls -l scratch
# total 0
# -rw-rw-r-- 1 tsnowlan tsnowlan 0 Jun 8 09:13 file1
# -rw-rw-r-- 1 tsnowlan tsnowlan 0 Jun 8 09:16 file2

Cgroup unexpectedly propagates SIGSTOP to the parent

I have a small script to run a command inside a cgroup that limits CPU time:
$ cat cgrun.sh
#!/bin/bash
if [[ $# -lt 1 ]]; then
echo "Usage: $0 <bin>"
exit 1
fi
sudo cgcreate -g cpu:/cpulimit
sudo cgset -r cpu.cfs_period_us=1000000 cpulimit
sudo cgset -r cpu.cfs_quota_us=100000 cpulimit
sudo cgexec -g cpu:cpulimit sudo -u $USER "$@"
sudo cgdelete cpu:/cpulimit
I let the command run: ./cgrun.sh /bin/sleep 10
Then I send SIGSTOP to the sleep command from another terminal. Somehow, at this moment, the parent commands sudo and cgexec receive this signal as well. Then I send SIGCONT to the sleep command, which allows sleep to continue.
But at this point sudo and cgexec are stopped and never reap the zombie of the sleep process. I don't understand how this can happen, and how can I prevent it? Moreover, I cannot send SIGCONT to sudo and cgexec, because I'm sending the signals as my own user, while those commands run as root.
Here is how it looks in htop (some columns omitted):
PID USER S CPU% MEM% TIME+ Command
1222869 user S 0.0 0.0 0:00.00 │ │ └─ /bin/bash ./cgrun.sh /bin/sleep 10
1222882 root T 0.0 0.0 0:00.00 │ │ └─ sudo cgexec -g cpu:cpulimit sudo -u user /bin/sleep 10
1222884 root T 0.0 0.0 0:00.00 │ │ └─ sudo -u desertfox /bin/sleep 10
1222887 user Z 0.0 0.0 0:00.00 │ │ └─ /bin/sleep 10
How can I create a cgroup in such a way that SIGSTOP is not bounced to the parent processes?
UPD
If I start the process using systemd-run, I do not observe the same behavior:
sudo systemd-run --uid=$USER -t -p CPUQuota=10% sleep 10
Instead of using the "cg tools", I would do it the "hard way" with plain shell commands: create the cpulimit cgroup (a mkdir), set the CFS parameters (echo into the corresponding cpu.cfs_* files), create a sub-shell with the (...) notation, move that sub-shell into the cgroup (echo its pid into the cgroup's tasks file), and execute the requested command from the sub-shell.
Hence, cgrun.sh would look like this:
#!/bin/bash
if [[ $# -lt 1 ]]; then
echo "Usage: $0 <bin>" >&2
exit 1
fi
CGTREE=/sys/fs/cgroup/cpu
sudo -s <<EOF
[ ! -d ${CGTREE}/cpulimit ] && mkdir ${CGTREE}/cpulimit
echo 1000000 > ${CGTREE}/cpulimit/cpu.cfs_period_us
echo 100000 > ${CGTREE}/cpulimit/cpu.cfs_quota_us
EOF
# Sub-shell in background
(
# Pid of the current sub-shell
# ($$ would return the pid of the parent process)
MY_PID=$BASHPID
# Move current process into the cgroup
sudo sh -c "echo ${MY_PID} > ${CGTREE}/cpulimit/tasks"
# Run the command with calling user id (it inherits the cgroup)
exec "$#"
) &
# Wait for the sub-shell
wait $!
# Exit code of the sub-shell
rc=$?
# Delete the cgroup
sudo rmdir ${CGTREE}/cpulimit
# Exit with the return code of the sub-shell
exit $rc
Run it (but first we get the pid of the current shell, so we can display the process hierarchy from another terminal):
$ echo $$
112588
$ ./cgrun.sh /bin/sleep 50
This creates the following process hierarchy:
$ pstree -p 112588
bash(112588)-+-cgrun.sh(113079)---sleep(113086)
Stop the sleep process:
$ kill -STOP 113086
Look at the cgroup to verify that the sleep command is running in it (its pid is in the tasks file) and that the CFS parameters are correctly set:
$ ls -l /sys/fs/cgroup/cpu/cpulimit/
total 0
-rw-r--r-- 1 root root 0 nov. 5 22:38 cgroup.clone_children
-rw-r--r-- 1 root root 0 nov. 5 22:38 cgroup.procs
-rw-r--r-- 1 root root 0 nov. 5 22:36 cpu.cfs_period_us
-rw-r--r-- 1 root root 0 nov. 5 22:36 cpu.cfs_quota_us
-rw-r--r-- 1 root root 0 nov. 5 22:38 cpu.shares
-r--r--r-- 1 root root 0 nov. 5 22:38 cpu.stat
-rw-r--r-- 1 root root 0 nov. 5 22:38 cpu.uclamp.max
-rw-r--r-- 1 root root 0 nov. 5 22:38 cpu.uclamp.min
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.stat
-rw-r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_all
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_percpu
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_percpu_sys
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_percpu_user
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_sys
-r--r--r-- 1 root root 0 nov. 5 22:38 cpuacct.usage_user
-rw-r--r-- 1 root root 0 nov. 5 22:38 notify_on_release
-rw-r--r-- 1 root root 0 nov. 5 22:36 tasks
$ cat /sys/fs/cgroup/cpu/cpulimit/tasks
113086 # This is the pid of sleep
$ cat /sys/fs/cgroup/cpu/cpulimit/cpu.cfs_*
1000000
100000
Send SIGCONT signal to the sleep process:
$ kill -CONT 113086
The process finishes and the cgroup is destroyed:
$ ls -l /sys/fs/cgroup/cpu/cpulimit
ls: cannot access '/sys/fs/cgroup/cpu/cpulimit': No such file or directory
Get the exit code of the script once it is finished (it is the exit code of the launched command):
$ echo $?
0

Top Command Output is Empty when run from cron

I was trying to redirect the output of the top command to a timestamped file every 5 minutes with the command below.
top -b -n 1 > /var/tmp/TOP_USAGE.csv.$(date +"%I-%M-%p_%d-%m-%Y")
-rw-r--r-- 1 root root 0 Dec 9 17:20 TOP_USAGE.csv.05-20-PM_09-12-2015
-rw-r--r-- 1 root root 0 Dec 9 17:25 TOP_USAGE.csv.05-25-PM_09-12-2015
-rw-r--r-- 1 root root 0 Dec 9 17:30 TOP_USAGE.csv.05-30-PM_09-12-2015
-rw-r--r-- 1 root root 0 Dec 9 17:35 TOP_USAGE.csv.05-35-PM_09-12-2015
Hence I made a very small (one-line) shell script for this, so that I can run it every 5 minutes via a cron job.
The problem is that when I run the script manually I can see the output in the file, but when the script runs automatically, the file is created every 5 minutes with no data in it (i.e. the file is empty).
Can anyone please help me with this?
I have now modified the script as shown below, and it still behaves the same way.
#!/bin/sh
PATH=$(/usr/bin/getconf PATH)
/usr/bin/top -b -n 1 > /var/tmp/TOP_USAGE.csv.$(date +"%I-%M-%p_%d-%m-%Y")
I ran into the same problem as you.
top must be run with the -b option, and its output saved to a variable before you use it.
The script is below:
date >> /tmp/mysql-mem-moniter.log
MEM=$(/usr/bin/top -b -n 1 -u mysql)
echo "$MEM" | grep mysql >> /tmp/mysql-mem-moniter.log
Most likely the environment passed to your script from cron is too minimal. In particular, PATH may not be what you think it is (no profiles are read by scripts started from cron).
Place PATH=$(/usr/bin/getconf PATH) at the start of your script, then run it with
/usr/bin/env -i /path/to/script
Once that works without error, it's ready for cron.
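Putting that together, the script and cron entry might look something like this (a sketch; the script location and schedule are assumptions, not from the original post):
#!/bin/sh
# hypothetical /usr/local/bin/top_usage.sh
PATH=$(/usr/bin/getconf PATH)
top -b -n 1 > /var/tmp/TOP_USAGE.csv.$(date +"%I-%M-%p_%d-%m-%Y")
And in the crontab:
# run every 5 minutes
*/5 * * * * /usr/local/bin/top_usage.sh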

Certain binaries run while others don't (despite ls visibility and +x) from mod_perl2 script

On Apache 2.2 on CentOS 6.4 with perl 5.10.1.
I'm trying to get a remote directory listing from within a mod_perl script, which apparently (if I die qx(id)) is running as apache. But I'm not even getting as far as being able to run ssh without parameters, just to have it print its help info. So that's what I'm asking how to do in this question--it's not about ssh not being able to connect.
die qx(which ssh);
dies with:
/usr/bin/ssh
and:
die qx(ls -al /usr/bin/ssh);
dies with:
-rwxr-xr-x. 1 root root 376920 Feb 21 2013 /usr/bin/ssh
Okay, so, it can find it, and see it, and has execute rights on it (which is true for /usr/bin and /usr as well.) But then:
die qx(/usr/bin/ssh); # or just 'ssh'
dies with an empty array, so I tried:
system("ssh") == 0 or die "failed: $! (code $?)"; # or '/usr/bin/ssh'
...which dies with:
No such file or directory (code 65280)
Why is this? How can I get die qx(ssh) or die qx(/usr/bin/ssh) to die with the expected value of:
usage: ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
[-D [bind_address:]port] [-e escape_char] [-F configfile]
[-i identity_file] [-L [bind_address:]port:host:hostport]
[-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
[-R [bind_address:]port:host:hostport] [-S ctl_path]
[-W host:port] [-w local_tun[:remote_tun]]
[user@]hostname [command]
Interestingly, from a bash prompt I get this:
$ sudo su apache ssh
This account is currently not available.
So...how can I run ls from an account that's not available, yet not ssh? They're both programs, why do they behave differently here?
Update: It's not just ssh, but I can't figure out the pattern: gawk, tar, and ping also don't work. Yet df, ls, dir, and pwd all do. But:
$ ls -al /bin/ls
-rwxr-xr-x. 1 root root 109208 May 23 07:00 /bin/ls
$ ls -al /usr/bin/dir
-rwxr-xr-x. 1 root root 109208 May 23 07:00 /usr/bin/dir
$ ls -al /bin/pwd
-rwxr-xr-x. 1 root root 28008 May 23 07:00 /bin/pwd
$ ls -al /bin/df
-rwxr-xr-x. 1 root root 73808 May 23 07:00 /bin/df
$ ls -al /bin/gawk
-rwxr-xr-x. 1 root root 375360 Aug 7 2012 /bin/gawk
$ ls -al /bin/tar
-rwxr-xr-x. 1 root root 390616 Feb 21 2013 /bin/tar
$ ls -al /usr/bin/ssh
-rwxr-xr-x. 1 root root 376920 Feb 21 2013 /usr/bin/ssh
$ ls -al /bin/ping
-rwsr-xr-x. 1 root root 40760 Jun 5 06:39 /bin/ping
So they all have all the 'x' bits set (except ping with its one 's', yet see below its error code), and for example, ssh and dir have identical ACLs. So why should ssh and gawk fail to give any output but dir and ls succeed? (Full paths or no.)
Update: even more perplexingly, /bin/gawk fails with the same message but code 256, and /bin/tar and /bin/ping similarly but code 512.
Update: OK, this part makes sense: If I run the failing binaries from the command line and then run echo $? immediately after, ssh gives 255, ping and tar give 2, and gawk gives 1. Those are scaled-down versions of what I get in mod_perl2. So, it seems to be that anything with a return code other than 0 doesn't work. Possibly it's outputting to STDERR, and so STDOUT doesn't capture anything, hence the blank return.
Aha, that's the answer--will post.
If you redirect STDERR to also capture it in your qx, like so:
die qx(ssh 2>&1);
...you'll get the output you get on the command line. So it's not that it's not running, it's just that it doesn't write anything to STDOUT.
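For completeness, here is a slightly fuller sketch of the same idea, capturing STDERR along with STDOUT and reporting the exit status (the variable names are just for illustration):
my $out    = qx(ssh 2>&1);   # STDERR merged into STDOUT, so qx() captures it
my $status = $? >> 8;        # exit status of the command (255 for bare ssh)
die "ssh exited with status $status, output:\n$out";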

Linux permissions issue on sftp server

Good day!
I have a Linux sftp server located in a VM. The VM has access to GlusterFS storage, where the sftp directories are located. Sftp works via the OpenSSH server, which chroots the sftpusers group into the sftp directories on the GlusterFS storage. Everything worked well... then at some point I ran into an issue...
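A typical sshd_config chroot block for this kind of setup looks roughly like this (a sketch only, not the exact config from this server; the ChrootDirectory path is a guess based on the paths below):
Subsystem sftp internal-sftp
Match Group sftpusers
    ChrootDirectory /mnt/cluster-data/repositories
    ForceCommand internal-sftp
    AllowTcpForwarding no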
Trying to create user:
# useradd -d /mnt/cluster-data/repositories/masters/test-user -G masters,sftpusers -m -s /bin/nologin test-user
Checking:
# cat /etc/passwd | grep test-user
test-user:x:1029:1032::/mnt/cluster-data/repositories/masters/test-user:/bin/nologin
# cat /etc/group | grep test-user
masters:x:1000:test-user
sftpusers:x:1005:test-user
test-user:x:1032:
Doing chown and chmod for home dir by hand:
# chown -R test-user:test-user /mnt/cluster-data/repositories/masters/test-user
# chmod -R 770 /mnt/cluster-data/repositories/masters/test-user
Checking:
# ls -la /mnt/cluster-data/repositories/masters/test-user
total 16
drwxrwx--- 2 test-user test-user 4096 Oct 27 2013 .
drwxr-xr-x 13 root masters 4096 Oct 27 2013 ..
Adding another user to test-user's group:
# usermod -G test-user -a tarasov-af
# cat /etc/passwd | grep tarasov-af
tarasov-af:x:1028:1006::/mnt/cluster-data/repositories/lecturers/tarasov-af/:/bin/nologin
# cat /etc/group | grep tarasov-af
masters:x:1000:tarasov-af,test-user
sftpusers:x:1005:tarasov-af,test-user
lecturers:x:1006:tarasov-af
specialists:x:1008:tarasov-af
test-user:x:1032:tarasov-af
Login as tarasov-af:
sftp> cd masters/test-user
sftp> ls
remote readdir("/masters/test-user"): Permission denied
sftp> ls -la ..
drwxr-xr-x 13 0 1000 4096 Oct 26 21:30 .
drwxr-xr-x 6 0 0 4096 Oct 2 15:53 ..
drwxrwx--- 2 1029 1032 4096 Oct 26 21:53 test-user
I tried to log in as tarasov-af with bash (usermod -s /bin/bash tarasov-af):
$ id
uid=1028 gid=1006
groups=1000,1005,1006,1008,1032
P.S. I guess this issue began after the VM disk failed and /etc/passwd and /etc/group got corrupted. I restored them from backups and all previous accounts work well; I only have this issue with new accounts.
I've found the reason for this issue: user tarasov-af has more than 16 secondary groups; the first 15 groups work fine, the rest don't. I set kernel.ngroups_max = 65535 in sysctl.conf on every computer in the cluster (GlusterFS) and on the sftp VM, but nothing changed.
The issue comes down to the GlusterFS client: it can't handle more than 15 secondary groups.
# glusterfs --version
glusterfs 3.2.7 built on Sep 29 2013 03:28:05
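A quick way to see how many groups an account actually carries (id -Gn prints all of the user's group names, primary included):
id -Gn tarasov-af | wc -w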
