250+ Shell Scripts, Advanced Bash environment & Utility Code Library.
Used heavily by all my GitHub repos, dozens of DockerHub builds (Dockerfiles) and 300+ CI builds.
Contains:
- Scripts for fast advanced systems administration including auto-populating required switches
- Scripts for CI builds, forming a drop-in framework of standard checks
- Advanced Bash environment - `.bashrc` + `.bash.d/*.sh` - tonnes of advanced customizations, aliases and functions. See .bash.d/README.md
- Advanced configuration files for common tools like vim, screen, tmux, installs the best sysadmin packages like those above plus AWS CLI, Azure CLI, GCloud SDK, jq and many others, adds dynamic Git and shell behaviour enhancements, colouring, functions, aliases and automatic pathing of many common installation locations for many major languages like Python, Perl, Ruby, NodeJS, Golang...
- Utility library used in many scripts here and sourced from other repos, using the 2 libraries:
  - `.bash.d/` - interactive library (huge)
  - `lib/` - scripting and CI library (heavily used by hundreds of scripts and builds)
For more advanced Systems Administration scripts in other languages, see the repos listed at the bottom of the page.
These scripts can be used straight from the git clone, but see the setup benefits of `make install` next.
Hari Sekhon
Cloud & Big Data Contractor, United Kingdom
(ex-Cloudera, former Hortonworks Consultant)
https://www.linkedin.com/in/harisekhon
make install
- Adds sourcing to `.bashrc` / `.bash_profile` to automatically inherit all `.bash.d/*.sh` environment enhancements for all technologies (see Inventory Overview below)
- Symlinks all `.*` config files to `$HOME` for git, vim, top, htop, screen, tmux, editorconfig, Ansible etc.
- Installs OS package dependencies for all scripts (detects the OS and installs the right RPMs, Debs, Apk or Mac HomeBrew packages)
- Installs Python packages including AWS CLI

`make install` effectively does `make system-packages bash python aws`, but if you want to pick and choose from different sections, see Individual Setup Parts below.
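To illustrate the sourcing that `make install` sets up, the loop below shows one way a `.bashrc` could inherit every `.bash.d/*.sh` file. This is a minimal sketch only, not the repo's actual code: the `source_bash_d` function name and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: source every *.sh file found in a given directory.
# The function name and argument are illustrative, not taken from the repo.
source_bash_d() {
    local dir="$1"
    local file
    for file in "$dir"/*.sh; do
        # skip the unexpanded glob when the directory has no *.sh files
        [ -f "$file" ] || continue
        # shellcheck disable=SC1090
        . "$file"
    done
}

# Example usage (the clone path below is made up for this sketch):
#   source_bash_d ~/github/bash-tools/.bash.d
```

One design caveat with this sketch: files sourced from inside a function run in that function's scope, so any `declare` or `local` statements at the top level of those files would create function-local variables. Plain assignments and function definitions are unaffected.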
- Scripts - Linux / Mac systems administration scripts:
  - installation scripts for various OS packages (RPM, Deb, Apk) for various Linux distros (Redhat RHEL / CentOS / Fedora, Debian / Ubuntu, Alpine)
  - install if absent scripts for Python, Perl, Ruby, NodeJS and Golang packages - good for minimizing the number of source code installs by first running the OS install scripts and then only building modules which aren't already detected as installed (provided by system packages), speeding up builds and reducing the likelihood of compile failures
  - install scripts for Jython and build tools like Gradle and SBT for when Linux distros don't provide packaged versions or where the packaged versions are too old
  - Git branch management
  - utility scripts used from other scripts
- `.*` - dot conf files for lots of common software eg. advanced `.vimrc`, `.gitconfig`, massive `.gitignore`, `.editorconfig`, `.screenrc`, `.tmux.conf` etc.
  - `.vimrc` - contains many awesome vim tweaks, plus hotkeys for linting lots of different file types in place, including Python, Perl, Bash / Shell, Dockerfiles, JSON, YAML, XML, CSV, INI / Properties files, LDAP LDIF etc. without leaving the editor!
  - `.screenrc` - fancy screen configuration including advanced colour bar, large history, hotkey reloading, auto-blanking etc.
  - `.tmux.conf` - fancy tmux configuration including advanced colour bar and plugins, settings, hotkey reloading etc.
  - Git:
    - `.gitconfig` - advanced Git configuration
    - `.gitignore` - extensive Git ignore of trivial files you shouldn't commit
    - enhanced Git diffs
    - protections against committing AWS access keys & secret keys, merge conflict unresolved files
- `.bashrc` - shell tuning and sourcing of `.bash.d/*.sh`
- `.bash.d/*.sh` - thousands of lines of advanced bashrc code, aliases, functions and environment variables for:
  - Linux & Mac
  - SCM - Git, Mercurial, Svn
  - AWS
  - GCP
  - Docker
  - Kubernetes
  - Kafka
  - Vagrant
  - automatic GPG and SSH agent handling for encrypted private keys without re-entering passwords, with lazy evaluation to only prompt for key load the first time SSH is called
  - and lots more - see .bash.d/README for a more detailed list
  - run `make bash` to link `.bashrc` / `.bash_profile` and the `.*` dot config files to your `$HOME` directory to auto-inherit everything
- `lib/*.sh` - Bash utility libraries full of functions for Docker, environment, CI detection (Travis CI, Jenkins etc), port and HTTP url availability content checks etc. Sourced from all my other GitHub repos to make setting up Dockerized tests easier.
- `setup/install_*.sh` - various simple to use installation scripts for common technologies like AWS CLI, Azure CLI, GCloud SDK, Terraform, Ansible, MiniKube, MiniShift (Kubernetes / Redhat OpenShift/OKD dev VMs), Maven, Gradle, SBT, EPEL, RPMforge, Homebrew, Travis CI, Circle CI, AppVeyor, BuildKite, Parquet Tools etc.
- `sql/*.sql` - SQL scripts eg. AWS Athena CloudTrail logs integration setup, Google BigQuery billing queries
- `aws*.sh` - various AWS scripts for EC2 metadata, Spot Termination, SSM Parameter Store secret put from prompt, IAM Credential Reports on IAM users without MFA, old access keys and passwords, old user accounts that haven't logged in or used an access key recently, show password policy / set hardened password policy, show unattached IAM policies, show account summary to check various details including root account MFA enabled and no access keys, KMS keys rotation status, CloudTrail & Config status etc.
- `gce*.sh` - Google Cloud scripts for GCE metadata API and pre-emption
- `curl_auth.sh` - wraps curl to send your username and password from environment variables or interactive prompt through a ram file descriptor to avoid using the `-u` / `--user` switch, which might otherwise expose your credentials in the process list or OS audit log files. Used by other API querying scripts
- `k8s_api.sh` - finds Kubernetes API and runs your curl arguments against it, auto-getting authorization token and populating `Authorization: Bearer` header
- `ldapsearch.sh` - wraps ldapsearch inferring settings from environment, can use environment variables for overrides
- `ldap_user_recurse.sh` / `ldap_group_recurse.sh` - recurse Active Directory LDAP users upwards to find all parent groups, or groups downwards to find all nested users (useful for debugging LDAP integration and group-based permissions)
- `kafka_*.sh` - scripts to make Kafka CLI usage easier including auto-setting Kerberos to source TGT from environment and auto-populating broker and zookeeper addresses. These are auto-added to the `$PATH` when `.bashrc` is sourced. For something similar for Solr, see `solr_cli.pl` in the DevOps Perl Tools repo.
- `zookeeper_client.sh` - wraps zookeeper-client, auto-populating the zookeeper quorum from the environment variable `$ZOOKEEPERS` or else parsing the zookeeper quorum from `/etc/**/*-site.xml` to make it faster and easier to connect
- `zookeeper_shell.sh` - wraps Kafka's zookeeper-shell, auto-populating the zookeeper quorum from the environment variable `$KAFKA_ZOOKEEPERS` and optionally `$KAFKA_ZOOKEEPER_ROOT` to make it faster and easier to connect
- `beeline.sh` - connects to HiveServer2 via beeline, auto-populating Kerberos and SSL settings, zookeepers for HiveServer2 HA discovery if the environment variable `$HIVE_HA` is set or using the `$HIVESERVER_HOST` environment variable so you can connect with no arguments (prompts for HiveServer2 address if you haven't set `$HIVESERVER_HOST` or `$HIVE_HA`)
  - `beeline_zk.sh` - connects to HiveServer2 HA via beeline, auto-populating SSL and ZooKeeper service discovery settings (specify `$HIVE_ZOOKEEPERS` environment variable to override). Automatically called by `beeline.sh` if either `$HIVE_ZOOKEEPERS` or `$HIVE_HA` is set (the latter parses `hive-site.xml` for the ZooKeeper addresses)
- `hive_foreach_table.sh` - executes a SQL query against every table, replacing `{db}` and `{table}` in each iteration eg. `select count(*) from {table}`
- `hive_*.sh` - various scripts using `beeline.sh` to list databases, tables, for all tables: row counts, DDL metadata field extraction, table locations etc.
- `impala_shell.sh` - connects to Impala via impala-shell, parsing the Hadoop topology map and selecting a random datanode to connect to its Impalad. This is mostly for convenience to shorten commands and while it acts as a poor man's load balancer, you might want to instead use my real load balancer HAProxy config for Impala (and many other Big Data & NoSQL technologies). Optional environment variables `$IMPALA_HOST` (eg. point to HAProxy load balancer) and `IMPALA_SSL=1` (or use regular impala-shell `--ssl` argument pass through)
- `impala_foreach_table.sh` - executes a SQL query against every table, replacing `{db}` and `{table}` in each iteration eg. `select count(*) from {table}`
- `impala_*.sh` - various scripts using `impala_shell.sh` to list databases, tables, for all tables: row counts, DDL metadata field extraction, table locations etc.
- `mysql.sh` - connects to MySQL via `mysql`, auto-populating settings from both standard environment variables like `$MYSQL_TCP_PORT`, `$DBI_USER`, `$MYSQL_PWD` (see doc) and other common environment variables like `$MYSQL_HOST` / `$HOST`, `$MYSQL_USER` / `$USER`, `$MYSQL_PASSWORD` / `$PASSWORD`, `$MYSQL_DATABASE` / `$DATABASE`
- `mysql_foreach_table.sh` - executes a SQL query against every table, replacing `{db}` and `{table}` in each iteration eg. `select count(*) from {table}`
- `mysql_*.sh` - various scripts using `mysql.sh` for row counts, iterating each table, or outputting clean lists of databases and tables for quick scripting
- `psql.sh` - connects to PostgreSQL via `psql`, auto-populating settings from environment variables, using both standard postgres supported environment variables like `$PG*` (see doc) as well as other common environment variables like `$POSTGRESQL_HOST` / `$POSTGRES_HOST` / `$HOST`, `$POSTGRESQL_USER` / `$POSTGRES_USER` / `$USER`, `$POSTGRESQL_PASSWORD` / `$POSTGRES_PASSWORD` / `$PASSWORD`, `$POSTGRESQL_DATABASE` / `$POSTGRES_DATABASE` / `$DATABASE`
- `postgres_foreach_table.sh` - executes a SQL query against every table, replacing `{db}`, `{schema}` and `{table}` in each iteration eg. `select count(*) from {table}`
- `postgres_*.sh` - various scripts using `psql.sh` for row counts, iterating each table, or outputting clean lists of databases, schemas and tables for quick scripting
- `hdfs_checksum*.sh` - walks an HDFS directory tree and outputs HDFS native checksums, MD5-of-MD5 or the portable externally comparable CRC32, in serial or in parallel to save time
- `hdfs_find_replication_factor_1.sh` / `hdfs_set_replication_factor_3.sh` - finds HDFS files with replication factor 1 / sets HDFS files with replication factor <=2 to replication factor 3 to repair replication safety and avoid no replica alarms during maintenance operations (see also Python API version in the DevOps Python Tools repo)
- `hdfs_file_size.sh` / `hdfs_file_size_including_replicas.sh` - quickly differentiate HDFS files raw size vs total replicated size
- `cloudera_manager_api.sh` - script to simplify querying Cloudera Manager API using environment variables, prompts, authentication and sensible defaults. Built on top of `curl_auth.sh`
- `cloudera_manager_impala_queries*.sh` - queries Cloudera Manager for recent Impala queries, failed queries, exceptions, DDL statements, metadata stale errors, metadata refresh calls etc. Built on top of `cloudera_manager_api.sh`
- `cloudera_manager_yarn_apps.sh` - queries Cloudera Manager for recent Yarn apps. Built on top of `cloudera_manager_api.sh`
- `cloudera_navigator_api.sh` - script to simplify querying Cloudera Navigator API using environment variables, prompts, authentication and sensible defaults. Built on top of `curl_auth.sh`
- `cloudera_navigator_audit_logs.sh` - fetches Cloudera Navigator audit logs for given service eg. hive/impala/hdfs via the API, simplifying date handling, authentication and common settings. Built on top of `cloudera_navigator_api.sh`
- `cloudera_navigator_audit_logs_download.sh` - downloads Cloudera Navigator audit logs for each service by year. Skips existing logs, deletes partially downloaded logs on failure, generally retry safe (while true, Control-C, not `kill -9` obviously). Built on top of `cloudera_navigator_audit_logs.sh`
- `check_*.sh` - extensive collection of generalized tests that can be applied to any repo (these run against all my GitHub repos via CI)
- `git*.sh` - various useful Git scripts, eg:
  - `git_foreach_branch.sh` - runs a command on all branches (useful in heavily version branched repos like in my Dockerfiles repo)
  - `git_foreach_repo.sh` - runs a command on all adjacent repos from a given repolist (useful for updating all your github projects across work and home computers)
  - `git_merge_all.sh` / `git_merge_master.sh` / `git_merge_master_pull.sh` - merges updates from master branch to all other branches to avoid drift on longer lived feature branches / version branches (eg. Dockerfiles repo)
  - `git_remotes_add_public_repos.sh` - auto-creates a checkout's remotes to the 3 major public repositories (GitHub/GitLab/Bitbucket)
  - `git_remotes_set_multi_origin.sh` - sets up multi-remote origin for unified push to all 3 major public repositories
  - `git_repos_update.sh` - updates multiple repos based on a source file mapping list - useful for easily sync'ing lots of Git repos among computers
  - `git_submodules_update_repos.sh` - submodule handling, including updating and committing latest submodule updates - used on all my repos for updating shared code submodules
- `github*.sh` - various useful GitHub scripts for querying the GitHub API including:
  - `github_api.sh` - querying the GitHub API while inferring github repo from local remotes and authenticating using `$GITHUB_TOKEN` or token from git checkout's remote github url when available. Used as a base for several other scripts that use the GitHub API. Built on top of `curl_auth.sh`
  - `github_get_user_ssh_public_key.sh` / `github_get_user_ssh_public_key_api.sh` - fetches GitHub users public SSH keys for quick local installation to `~/.ssh/authorized_keys`
  - `github_generate_status_page.sh` - generates a STATUS.md page by merging all the README.md headers for all a user's non-forked GitHub repos or a given list of any repos etc.
- `jenkins_cli.sh` - runs Jenkins CLI, auto-inferring basic configurations, auto-downloads `jenkins-cli.jar` from Jenkins server if not present, infers a bunch of Jenkins related variables like `$JENKINS_URL` and authentication from `$JENKINS_USER` / `$JENKINS_PASSWORD`, or finds admin password from inside local docker container. Used heavily by `jenkins.sh` one-shot setup
- `jenkins_password.sh` - gets Jenkins admin password from local docker container. Used by `jenkins_cli.sh`
- `jenkins.sh` - one-touch Jenkins CI, launches in docker, installs plugins, validates `Jenkinsfile`, configures jobs from `$PWD/setup/jenkins-job.xml` and sets Pipeline to git remote origin's `Jenkinsfile`, triggers build, tails results in terminal. Call from any repo top level directory with a `Jenkinsfile` pipeline and `setup/jenkins-job.xml` (all mine have it)
- `concourse.sh` - one-touch Concourse CI, launches in docker, configures pipeline from `$PWD/.concourse.yml`, triggers build, tails results in terminal, prints recent build statuses at end. Call from any repo top level directory with a `.concourse.yml` config (all mine have it), mimicking structure of fully managed CI systems
- `gocd.sh` - one-touch GoCD, launches in docker, (re)creates config repo (`$PWD/setup/gocd_config_repo.json`) from which to source pipeline(s) (`.gocd.yml`), detects and enables agent(s) to start building. Call from any repo top level directory with a `.gocd.yml` config (all mine have it), mimicking structure of fully managed CI systems
- `perl*.sh` - various Perl utilities eg:
  - `perl_cpanm_install.sh` - bulk installs CPAN modules from mix of arguments / file lists / stdin, accounting for User vs System installs, root vs user sudo, Perlbrew / Google Cloud Shell environments, Mac vs Linux library paths, ignore failure option, auto finds and reads build failure log for quicker debugging showing root cause error in CI builds logs etc
  - `perl_cpanm_install_if_absent.sh` - installs CPAN modules not already in Perl library path (OS or CPAN installed) for faster installations only where OS packages are already providing some of the modules, reducing time and failure rates in CI builds
  - `perlpath.sh` - prints all Perl library search paths, one per line
  - `perl_find_library_path.sh` - finds directory where a CPAN module is installed - without args finds the Perl library base
  - `perl_find_library_executable.sh` - finds directory where a CPAN module's CLI program is installed (system vs user, useful when it gets installed to a place that isn't in your `$PATH`, where `which` won't help)
  - `perl_find_unused_cpan_modules.sh` - finds CPAN modules that aren't used by any programs in the current directory tree
  - `perl_find_duplicate_cpan_requirements.sh` - finds duplicate CPAN modules listed for install more than once under the directory tree (useful for deduping module installs in a project and across submodules)
  - `perl_generate_fatpacks.sh` - creates Fatpacks - self-contained Perl programs with all CPAN modules built-in
- `python*.sh` - various Python utilities eg:
  - `python_compile.sh` - byte-compiles Python scripts and libraries into `.pyo` optimized files
  - `python_pip_install.sh` - bulk installs PyPI modules from mix of arguments / file lists / stdin, accounting for User vs System installs, root vs user sudo, VirtualEnvs / Anaconda / GitHub Workflows / Google Cloud Shell, Mac vs Linux library paths, and ignore failure option
  - `python_pip_install_if_absent.sh` - installs PyPI modules not already in Python library path (OS or pip installed) for faster installations only where OS packages are already providing some of the modules, reducing time and failure rates in CI builds
  - `python_pip_reinstall_all_modules.sh` - reinstalls all PyPI modules which can fix some issues
  - `pythonpath.sh` - prints all Python library search paths, one per line
  - `python_find_library_path.sh` - finds directory where a PyPI module is installed - without args finds the Python library base
  - `python_find_library_executable.sh` - finds directory where a PyPI module's CLI program is installed (system vs user, useful when it gets installed to a place that isn't in your `$PATH`, where `which` won't help)
  - `python_find_unused_pip_modules.sh` - finds PyPI modules that aren't used by any programs in the current directory tree
  - `python_find_duplicate_pip_requirements.sh` - finds duplicate PyPI modules listed for install under the directory tree (useful for deduping module installs in a project and across submodules)
  - `python_module_to_import_name.sh` - converts PyPI module names to Python import names, used by `python_find_unused_pip_modules.sh`
  - `python_pyinstaller.sh` - creates PyInstaller self-contained Python programs with Python interpreter and all PyPI modules included
- `spotify_*.sh` - Spotify API scripts to list playlists, track URIs, artists and track names, create backups of all Spotify playlists, iterate any command against all playlists etc.
- all builds across all my GitHub repos now run `make system-packages` before `make pip` / `make cpan` to cut how many packages need installing, reducing chances of build failures
- Programming language linting:
- Build System & CI linting:
- Data format validation using programs from my DevOps Python Tools repo:
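The `*_foreach_table.sh` scripts listed above replace `{db}` and `{table}` in a query template on each iteration. A minimal sketch of that kind of substitution using Bash parameter expansion is shown here; `render_query` and the sample database and table names are hypothetical, and the real scripts may implement this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill {db} and {table} placeholders in a SQL template.
render_query() {
    local query="$1"
    local db="$2"
    local table="$3"
    # Keep the placeholder tokens in variables and quote them in the pattern
    # so the braces are matched literally
    local db_token='{db}'
    local table_token='{table}'
    query=${query//"$db_token"/$db}
    query=${query//"$table_token"/$table}
    printf '%s\n' "$query"
}

# Example: iterate a made-up list of "db table" pairs
while read -r db table; do
    render_query 'select count(*) from {db}.{table}' "$db" "$table"
done <<EOF
sales orders
sales customers
EOF
```

In a real wrapper the rendered query would be passed to the relevant client (for example `beeline.sh` or `mysql.sh`) rather than printed.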
Currently utilized in the following GitHub repos:
- DevOps Python Tools - 80+ DevOps CLI tools for AWS, Hadoop, HBase, Spark, Log Anonymizer, Ambari Blueprints, AWS CloudFormation, Linux, Docker, Spark Data Converters & Validators (Avro / Parquet / JSON / CSV / INI / XML / YAML), Elasticsearch, Solr, Travis CI, Pig, IPython
- The Advanced Nagios Plugins Collection - 450+ programs for Nagios monitoring your Hadoop & NoSQL clusters. Covers every Hadoop vendor's management API and every major NoSQL technology (HBase, Cassandra, MongoDB, Elasticsearch, Solr, Riak, Redis etc.) as well as message queues (Kafka, RabbitMQ), continuous integration (Jenkins, Travis CI) and traditional infrastructure (SSL, Whois, DNS, Linux)
- DevOps Perl Tools - 25+ DevOps CLI tools for Hadoop, HDFS, Hive, Solr/SolrCloud CLI, Log Anonymizer, Nginx stats & HTTP(S) URL watchers for load balanced web farms, Dockerfiles & SQL ReCaser (MySQL, PostgreSQL, AWS Redshift, Snowflake, Apache Drill, Hive, Impala, Cassandra CQL, Microsoft SQL Server, Oracle, Couchbase N1QL, Dockerfiles, Pig Latin, Neo4j, InfluxDB), Ambari FreeIPA Kerberos, Datameer, Linux...
- HAProxy-configs - 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, Cloudera, Hortonworks, MapR, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, ZooKeeper, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, SSH, RabbitMQ, Redis, Riak, Rancher etc.
- Dockerfiles - 50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Mesos, Consul, Riak, OpenTSDB, Jython, Advanced Nagios Plugins & DevOps Tools repos on Alpine, CentOS, Debian, Fedora, Ubuntu, Superset, H2O, Serf, Alluxio / Tachyon, FakeS3
- Perl Lib - Perl utility library
- PyLib - Python utility library
- Lib-Java - Java utility library
- Nagios Plugin Kafka - Kafka Nagios Plugin written in Scala with Kerberos support
Pre-built Docker images are available for those repos (which include this one as a submodule) and the "docker available" icon above links to an uber image which contains all my GitHub repos pre-built. There are CentOS, Alpine, Debian and Ubuntu versions of this uber Docker image containing all repos.
Optional, only if you don't do the full `make install`.
Install only OS system package dependencies and AWS CLI via Python Pip (doesn't symlink anything to `$HOME`):
make
Adds sourcing to `.bashrc` and `.bash_profile` and symlinks dot config files to `$HOME` (doesn't install OS system package dependencies):
make link
undo via
make unlink
Install only OS system package dependencies (doesn't include AWS CLI or Python packages):
make system-packages
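As a rough sketch of how detecting the OS by whichever package manager is available can work, the function below prints the first package manager it finds in `$PATH`. The function name and the probe order are assumptions made for illustration, not the repo's actual logic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first known package manager found in $PATH.
# The probe order here is an assumption made for this sketch.
detect_package_manager() {
    local pm
    for pm in apt-get yum apk brew; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "no supported package manager found" >&2
    return 1
}

# Example usage:
#   if pm="$(detect_package_manager)"; then
#       echo "would install packages using: $pm"
#   fi
```

Probing for the tool rather than parsing OS release files keeps the check short and also covers Mac, where Homebrew is the only one of these likely to be present.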
Install AWS CLI:
make aws
Install Azure CLI:
make azure
Install GCP GCloud SDK (includes CLI):
make gcp
Install GCP GCloud Shell environment (sets up persistent OS packages and all home directory configs):
make gcp-shell
Install generically useful Python CLI tools and modules (includes AWS CLI, autopep8 etc):
make python
> make help
Usage:
  Common Options:
    make help                 show this message
    make build                installs all dependencies - OS packages and any language libraries via native tools eg. pip, cpanm, gem, go etc that are not available via OS packages
    make system-packages      installs OS packages only (detects OS via whichever package manager is available)
    make test                 run tests
    make clean                removes compiled / generated files, downloaded tarballs, temporary files etc.
    make submodules           initialize and update submodules to the right release (done automatically by build / system-packages)
    make cpan                 install any modules listed in any cpan-requirements.txt files if not already installed
    make pip                  install any modules listed in any requirements.txt files if not already installed
    make python-compile       compile any python files found in the current directory and 1 level of subdirectory
    make pycompile
    make github               open browser at github project
    make readme               open browser at github's README
    make github-url           print github url and copy to clipboard
    make ls-files             print list of files in project
    make ls-code              print list of code files, excluding READMEs and other peripheral files
    make wc                   show line counts of the files and grand total
    make wc-code              show line counts of only code files and total
  Repo specific options:
    make install              builds all script dependencies, installs AWS CLI, symlinks all config files to $HOME and adds sourcing of bash profile
    make link                 symlinks all config files to $HOME and adds sourcing of bash profile
    make unlink               removes all symlinks pointing to this repo's config files and removes the sourcing lines from .bashrc and .bash_profile
    make python-desktop       installs all Python Pip packages for desktop workstation listed in setup/pip-packages-desktop.txt
    make perl-desktop         installs all Perl CPAN packages for desktop workstation listed in setup/cpan-packages-desktop.txt
    make ruby-desktop         installs all Ruby Gem packages for desktop workstation listed in setup/gem-packages-desktop.txt
    make golang-desktop       installs all Golang packages for desktop workstation listed in setup/go-packages-desktop.txt
    make nodejs-desktop       installs all NodeJS packages for desktop workstation listed in setup/npm-packages-desktop.txt
    make desktop              installs all of the above + many desktop OS packages listed in setup/
    make mac-desktop          all of the above + installs a bunch of major common workstation software packages like Ansible, Terraform, MiniKube, MiniShift, SDKman, Travis CI, CCMenu, Parquet tools etc.
    make linux-desktop
    make ls-scripts           print list of scripts in this project, ignoring code libraries in lib/ and .bash.d/
    make wc-scripts           show line counts of the scripts and grand total
    make wc-scripts2          show line counts of only scripts and total
    make vim                  installs Vundle and plugins
    make tmux                 installs TMUX plugin for kubernetes context
    make ccmenu               installs and (re)configures CCMenu to watch this and all other major HariSekhon GitHub repos
    make status               open the Github Status page of all my repos build statuses across all CI platforms
    make aws                  installs AWS CLI tools
    make azure                installs Azure CLI
    make azure-shell          sets up Azure Cloud Shell (limited, doesn't install OS packages since there is no sudo)
    make gcp                  installs GCloud SDK
    make gcp-shell            sets up GCP Cloud Shell: installs core packages and links configs
                              (future boots then auto-install system packages via .customize_environment hook)
Now exiting usage help with status code 3 to explicitly prevent silent build failures from stray 'help' arguments
make: *** [help] Error 3
(`make help` exits with error code 3 like most of my programs to differentiate from build success to make sure a stray `help` argument doesn't cause silent build failure with exit code 0)
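A minimal sketch of this convention in a shell script: the usage function prints help and then exits non-zero, so a stray `help` or `--help` argument in an automated build fails loudly instead of passing. The `usage` and `main` names are illustrative only, not copied from the repo.

```shell
#!/usr/bin/env bash
# Illustrative sketch: usage help exits with status 3 rather than 0.
usage() {
    echo "usage: ${0##*/} [options]" >&2
    exit 3
}

main() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            -h|--help|help) usage ;;
        esac
    done
    echo "running normally"
}

# Run main in subshells so the exit status can be inspected here
status=0
( main --help ) 2>/dev/null || status=$?
echo "exit code with --help: $status"

status=0
( main ) >/dev/null || status=$?
echo "exit code without --help: $status"
```

A CI step that checks the exit code will then treat an accidental help invocation as a failure rather than a successful build.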