I am using a borrowed SGE system, but my job does not run when I submit it with qsub. Here is my job script:

#!/bin/bash
#PBS -l nodes=1:ppn=4
#PBS -N pix2pix
#PBS -o pix2pix.out
#PBS -e pix2pix.err
#PBS -q GPU_HIGH
#PBS -l walltime=72000:00:00

#cd $PBS_O_WORKDIR
cd /public/home/chensu/others/Guicai/pix2pix-tensorflow
source activate tensorflow

python /public/home/chensu/others/Guicai/pix2pix-tensorflow/pix2pix.py \
  --mode train \
  --output_dir /public/home/chensu/others/Guicai/pix2pix-tensorflow/face2face-model \
  --max_epochs 2000 \
  --input_dir /public/home/chensu/others/Guicai/pix2pix-tensorflow/photos/combined/train \
  --which_direction AtoB \
  --batch_size 4

There are many free queues:

qstat -q

server: admin1

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
FAT_HIGH           --      --       --      --    0   0  0   E R
high               --      --       --      --   10   0 10   E R
MIC_HIGH           --      --       --      --    0   0  0   E R
GPU_HIGH           --      --       --      --    0   0  0   E R
low                --      --       --      --    4   0 20   E R
batch              --      --       --      --    0   0 20   E R
middle             --      --       --      --    0   0 10   E R
                                               ----- -----
                                                  14     0

When I submit my script with qsub test.pbs, the job status shows:

MGMT1:
                                                            Req'd  Req'd     Elap
Job ID       Username Queue    Jobname  SessID NDS TSK Memory Time      S Time
20414.MGMT1  chensu   GPU_HIGH pix2pix    --     1   4   --   72000:00: C  --

Also, no log files are produced, so I don't know what happened.

Any suggestions will be appreciated.

Taufik_TF
  • The walltime seems high. As a sysadmin, I would not allow such a job to run, and would impose a limit on walltime. Try a lower walltime and see if your job runs. – Vince Apr 23 '18 at 19:17
  • Hi Vince, I can run the program on the login node, but after qsub it reports: python3: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by python3) python3: /lib64/libc.so.6: version `GLIBC_2.17' not found (required by python3) – Taufik_TF Apr 24 '18 at 14:18
  • Python 3.6.5 (default, Apr 19 2018, 03:19:34) [GCC 4.9.1] on linux Type "help", "copyright", "credits" or "license" for more information. >>> print("hi") hi – Taufik_TF Apr 24 '18 at 14:20
  • It seems the C libraries on the login node differ from those on the execution node. If this is the issue, then you likely need your sysadmin to fix it. – Vince Apr 24 '18 at 15:12
  • So if I have no root permission, I can't update python2 to python3 on the login node? – Taufik_TF Apr 25 '18 at 00:48
  • The issue does not appear to be related to python2 versus python3. It is an issue with the standard C library: python 3 is present, but the node lacks version 2.17 of glibc. This is a common problem, and it is best addressed by a sysadmin, as updating a C library system-wide normally requires root access. – Vince Apr 25 '18 at 18:14
  • Also a possible duplicate of: https://stackoverflow.com/questions/33655731/error-while-importing-tensorflow-in-python2-7-in-ubuntu-12-04-glibc-2-17-not-f – Vince Apr 25 '18 at 18:15
  • Possible duplicate of [Error while importing Tensorflow in python2.7 in Ubuntu 12.04. 'GLIBC\_2.17 not found'](https://stackoverflow.com/questions/33655731/error-while-importing-tensorflow-in-python2-7-in-ubuntu-12-04-glibc-2-17-not-f) – Vince Apr 25 '18 at 18:17
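
To check the suspicion raised in the comments, that the login node and the execution node carry different glibc versions, a small diagnostic script could be run in both places. The following is only a sketch: the function names (version_ge, glibc_version, report_glibc) are made up for illustration, it assumes `getconf GNU_LIBC_VERSION` is available (it is on glibc-based Linux systems) and that `sort` supports the GNU `-V` option, and the 2.17 threshold is taken from the error message quoted above.

```shell
#!/bin/bash
# Hypothetical diagnostic helper (not part of the original job script).
# Run it on the login node and again inside a submitted job, then compare.

# Succeeds when version $1 >= version $2.
# Assumes a `sort` that supports version ordering (-V), as GNU sort does.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the glibc version of the current host, e.g. "2.12".
# `getconf GNU_LIBC_VERSION` prints a line such as "glibc 2.12".
glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

# Reports whether this host has at least the required glibc version.
report_glibc() {
    local required="$1"
    local found
    found="$(glibc_version)"
    if [ -z "$found" ]; then
        echo "$(uname -n): could not determine glibc version"
    elif version_ge "$found" "$required"; then
        echo "$(uname -n): glibc $found (>= $required)"
    else
        echo "$(uname -n): glibc $found (< $required, too old)"
    fi
}

# 2.17 is the newest version named in the error message above.
report_glibc 2.17
```

Running the script directly on the login node and then submitting the same file with qsub should produce two lines that can be compared. If the execution node reports a version below 2.17, that would be consistent with the diagnosis in the comments.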
