Internet connection via SSH tunnel (CentOS 7)

1. On server A (the machine without internet access), set the proxy by adding this line to /etc/yum.conf:

proxy=socks5h://localhost:1080
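If this change needs to be scripted rather than made by hand, something like the following could work. It is a minimal sketch that assumes GNU sed and that [main] is the only section in the file, so appending at the end is safe. set_yum_proxy is a hypothetical helper name, not a yum command.

```shell
# Sketch: add or update the "proxy=" line in a yum configuration file.
# set_yum_proxy is a hypothetical helper, not part of yum.
set_yum_proxy() {
    local conf="$1" url="$2"
    if grep -q '^proxy=' "$conf"; then
        # A proxy line already exists: rewrite it in place.
        sed -i "s|^proxy=.*|proxy=${url}|" "$conf"
    else
        # No proxy line yet: append one (assumes [main] is the last section).
        printf 'proxy=%s\n' "$url" >> "$conf"
    fi
}

# Intended call on server A (run as root):
#   set_yum_proxy /etc/yum.conf socks5h://localhost:1080
```

Running it twice leaves a single proxy line, so it is safe to repeat.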

2. From server A, open a SOCKS tunnel to a server that has full web access (keep this session open):

ssh -D 1080 YOUR_USER@YOUR_SERVER_WITH_FULL_WEB_ACCESS

3. Open another terminal window and log in to server A. Test the internet connection:

curl --socks5 127.0.0.1:1080 http://www.baidu.com
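When the curl test fails, it helps to know whether the tunnel itself is up. The sketch below checks whether anything is listening on the local port. It assumes bash (it relies on bash's /dev/tcp redirection), and port_open is a hypothetical helper name.

```shell
# Sketch: report whether something is listening on a local TCP port.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
# port_open is a hypothetical helper name.
port_open() {
    local host="$1" port="$2"
    # The subshell opens the socket and closes it again when it exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

if port_open 127.0.0.1 1080; then
    echo "SOCKS tunnel is listening on 127.0.0.1:1080"
else
    echo "nothing is listening on 127.0.0.1:1080; is the ssh -D session still open?"
fi
```

This only confirms that the port accepts connections; it does not prove that the remote server can reach the web.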

4. Set up the proxy for conda and git:

  1) Edit .condarc:

proxy_servers:
  http: socks5://127.0.0.1:1080
  https: socks5://127.0.0.1:1080

  2) Edit .gitconfig (git applies http.proxy to both HTTP and HTTPS remotes; it has no separate [https] section):

[http]
    proxy = socks5://127.0.0.1:1080
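Tools that have no proxy setting of their own can often pick up the tunnel from the environment instead. The sketch below exports ALL_PROXY, which curl reads; whether any other tool honors it is an assumption to check per tool. The socks5h scheme asks the proxy to resolve hostnames, which matters when server A has no working DNS.

```shell
# Sketch: export the tunnel address for tools that honor proxy
# environment variables. curl reads ALL_PROXY; other tools may ignore it.
export ALL_PROXY="socks5h://127.0.0.1:1080"

# With the variable set, curl no longer needs the --socks5 option:
#   curl -I http://www.baidu.com
echo "ALL_PROXY=${ALL_PROXY}"
```

The export lasts only for the current shell session; adding it to a shell startup file would make it permanent.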

CentOS 7: mirrorlist.centos.org is no longer online

1. In the repo files under /etc/yum.repos.d/ (including CentOS-Base.repo), replace "mirror.centos.org" with "vault.centos.org", re-enable the commented baseurl lines, and comment out the mirrorlist lines:

sed -i 's/mirror\.centos\.org/vault.centos.org/g' /etc/yum.repos.d/*.repo
sed -i 's/^#.*baseurl=http/baseurl=http/g' /etc/yum.repos.d/*.repo
sed -i 's/^mirrorlist=http/#mirrorlist=http/g' /etc/yum.repos.d/*.repo

2. Update yum:

yum update
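Because sed -i rewrites the repo files in place, it can be worth trying the three substitutions from step 1 on a scratch copy first. The sketch below does that on a sample stanza; the stanza is illustrative (modeled on a stock CentOS 7 repo file, which is an assumption), and GNU sed is assumed.

```shell
# Sketch: run the three substitutions on a scratch copy and inspect the result.
work=$(mktemp -d)

# Illustrative stanza; the quoted 'EOF' keeps $releasever and friends literal.
cat > "$work/CentOS-Base.repo" <<'EOF'
[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
EOF

# Same edits as in step 1, pointed at the scratch directory.
sed -i 's/mirror\.centos\.org/vault.centos.org/g' "$work"/*.repo
sed -i 's/^#.*baseurl=http/baseurl=http/g' "$work"/*.repo
sed -i 's/^mirrorlist=http/#mirrorlist=http/g' "$work"/*.repo

# Show only the lines the edits were meant to touch.
grep -E '^#?(baseurl|mirrorlist)=' "$work/CentOS-Base.repo"
```

The output should show the mirrorlist line commented out and the baseurl line active and pointing at vault.centos.org.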

Install environment modules

1. Download the source:

curl --socks5 127.0.0.1:1080 -LJO https://github.com/cea-hpc/modules/releases/download/v5.4.0/modules-5.4.0.tar.gz

2. Extract the archive and enter the source directory:

tar xfz modules-5.4.0.tar.gz
cd modules-5.4.0

3. Compile and install:

./configure --prefix=/usr/share/Modules --modulefilesdir=/usr/share/Modules/modulefiles
make
make install

4. Enable Modules at shell startup by linking (or copying) the init scripts, where PREFIX is the --prefix passed to configure (/usr/share/Modules above):

ln -s PREFIX/init/profile.sh /etc/profile.d/modules.sh
ln -s PREFIX/init/profile.csh /etc/profile.d/modules.csh

5. Test by creating a modulefile from a shell script:

module sh-to-mod bash example/source-script-in-modulefile/foo-1.2/foo-setup.sh arg1 > modulefiles/foo/1.2

Install CentOS 6 on a computing node on cluster # 10/11

1. Prepare the CentOS installation media on a Mac (see https://docs.centos.org/en-US/centos/install-guide/Making_Media_USB_Mac):

> diskutil list
> diskutil unmountDisk /dev/disknumber # replace "number" with the disk number shown by diskutil list
> sudo dd if=/path/to/image.iso of=/dev/rdisknumber bs=1m # the raw device /dev/rdisknumber is faster than /dev/disknumber
> # press Ctrl + t while dd is running to check progress
> diskutil eject /dev/disknumber

2. Install the Software Development (or Server) package set. Partition the disk as follows: /boot 500 MB ext3, swap 64 GB, and all remaining space for /

3. Disable SELinux: edit /etc/selinux/config and replace SELINUX=enforcing with SELINUX=disabled (takes effect after a reboot)
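When several nodes need the same change, the edit can be scripted. The following is a minimal sketch assuming GNU sed; disable_selinux is a hypothetical helper name. The pattern is anchored on "SELINUX=" so the SELINUXTYPE line is left alone.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file.
# disable_selinux is a hypothetical helper name.
disable_selinux() {
    local conf="$1"
    # Matches SELINUX=enforcing or SELINUX=permissive, but not SELINUXTYPE=...
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}

# Intended call on the computing node (run as root):
#   disable_selinux /etc/selinux/config
```

Until the node is rebooted, setenforce 0 switches the running system to permissive mode.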

4. Set up the IP address:

    1) edit /etc/sysconfig/network, /etc/sysconfig/network-scripts/ifcfg-em1, /etc/sysconfig/network-scripts/ifcfg-ib0

    2) edit /etc/hosts, /etc/hosts.equiv

    3) delete /root/.ssh/known_hosts on management node, copy /root/.ssh on management node to computing node

    4) restart network: /etc/init.d/network restart or service network restart

5. Create yum repository from CentOS iso file:

    1) mount the iso

> mount -o loop,ro centos6.iso /mnt/centos6-iso

    2) create os.repo under /etc/yum.repos.d/

[os]
name=os
baseurl=file:///mnt/centos6-iso
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6

    3) test

> yum list
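The repo file from sub-step 2 can also be generated from the shell, which avoids typos when repeating the setup on several nodes. This is a sketch only; write_os_repo is a hypothetical helper name, and the mount point is passed in so that it matches whatever directory the ISO was mounted on.

```shell
# Sketch: write a yum repo definition that points at a mounted ISO.
# write_os_repo is a hypothetical helper name.
write_os_repo() {
    local repo_file="$1" mount_point="$2"
    # mount_point is an absolute path, so file:// + /mnt/... gives file:///mnt/...
    cat > "$repo_file" <<EOF
[os]
name=os
baseurl=file://${mount_point}
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
EOF
}

# Intended use on the computing node (run as root):
#   mkdir -p /mnt/centos6-iso
#   mount -o loop,ro centos6.iso /mnt/centos6-iso
#   write_os_repo /etc/yum.repos.d/os.repo /mnt/centos6-iso
#   yum --disablerepo='*' --enablerepo=os list
```

Restricting the test to the os repo shows whether packages really come from the ISO rather than from another configured repository.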

6. Install IB firmware:

    Check the following pages first:

        – https://blog.csdn.net/sinat_36458870/article/details/116596019

        – https://blog.csdn.net/force_eagle/article/details/46728665

    1) install tcl-devel, tk-devel, and libmnl-devel with yum

    2) install mft-4.14.2 (download from https://network.nvidia.com/products/adapter-software/firmware-tools)

    3) check HCA firmware: https://www.mellanox.com/support/firmware/identification

> [root@node3_ib ~]# lspci | grep Mell
> 03:00.0 Network controller [0207]: Mellanox Technologies MT27500 Family [ConnectX-3]
> [root@node3_ib ~]# flint -d 03:00.0 q
> Image type: FS2
> FW Version: 2.40.5048
> FW Release Date: 5.3.2017
> Product Version: 02.40.50.48
> Rom Info: type=PXE version=3.4.746 devid=4099
> Device ID: 4099
> …
> PSID: DEL0A20000018 <= this is the number we need

    4) Find the firmware that matches the PSID on the Mellanox website: https://www.mellanox.com/support/firmware/firmware-downloads

        – For Dell PSID: https://www.mellanox.com/support/firmware/dell?mtag=oem_firmware_download

    5) Check device

> mst start
> mst status

    6) Burn (reboot afterwards)

> flint -d /dev/mst/mt4099_pciconf0 -i fw-ConnectX3-xxxxx.bin burn

    7) Install the MLNX_OFED driver package (its installer can also update the firmware): https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed

> MLNX_OFED_LINUX-4.1-1.0.2.0-rhel6.5-x86_64/mlnxofedinstall

    8) Restart openibd

> /etc/init.d/openibd restart

    9) test whether the installation succeeded

> ofed_info | head
> ibstat
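The PSID from the flint query in sub-step 3 is the value that has to be matched on the download page, so it can be handy to pull it out of the output directly. A small sketch, assuming the query output keeps the "PSID: value" layout shown above; extract_psid is a hypothetical helper name.

```shell
# Sketch: print the PSID from "flint ... q" output read on stdin.
# extract_psid is a hypothetical helper name.
extract_psid() {
    # Fields are split on whitespace, so the amount of padding after
    # "PSID:" does not matter.
    awk '$1 == "PSID:" { print $2 }'
}

# Demonstration on two lines copied from the query output above:
printf 'FW Version: 2.40.5048\nPSID: DEL0A20000018\n' | extract_psid

# Intended use on the node:
#   flint -d 03:00.0 q | extract_psid
```

The demonstration prints DEL0A20000018, the Dell PSID from the example node.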

7. Set up the SGE queue system (everything is already available on the IO node, so we only need to mount /home and /backup):

    1) create /backup directory

    2) add the /home and /backup mounts to /etc/fstab and run mount -a

    3) export SGE_ROOT=/home/jointforce

    4) cd /home/jointforce && ./install_execd

    5) copy /home/jointforce/default/common/settings* to /etc/profile.d/

8. Disable iptables (or use the CentOS setup command)

> service iptables save
> service iptables stop
> chkconfig iptables off

9. Set up NIS (or use the setup command): http://cn.linux.vbird.org/linux_server/0430nis.php

    1) edit /etc/sysconfig/network

    2) edit /etc/yp.conf

    3) edit /etc/nsswitch.conf

10. Install easycluster

    1) copy /opt/easycluster from any other node

    2) add /opt/easycluster/background/easy_c to /etc/rc.local

    3) start easycluster
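If the node is reinstalled or the setup is rerun, the rc.local entry from sub-step 2 should not end up in the file twice. The sketch below appends a line only when it is not already present; add_line_once is a hypothetical helper name.

```shell
# Sketch: append a line to a file unless that exact line is already there.
# add_line_once is a hypothetical helper name.
add_line_once() {
    local file="$1" line="$2"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended call on the computing node (run as root):
#   add_line_once /etc/rc.local /opt/easycluster/background/easy_c
```

The same helper works for other one-line additions made during this setup.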

11. Finalize: copy any remaining site-specific scripts to /etc/profile.d/