Seventh Sense Rambling about life's little things, in 7 ≡ 1 (mod 6) fashion


BASH – GZIP or BZIP2?

BASH is a free software Unix shell written for the GNU Project. Its name is an acronym which stands for Bourne-again shell. The name is a pun on the name of the Bourne shell (sh), an early and important UNIX shell written by Stephen Bourne and distributed with Version 7 Unix circa 1978, and the concept of being born again. BASH was created in 1987 by Brian Fox. In 1990 Chet Ramey became the primary maintainer. BASH is the default shell on most GNU/Linux systems as well as on Mac OS X and it can be run on most UNIX-like operating systems. It has also been ported to Microsoft Windows using the POSIX emulation provided by Cygwin, to MS-DOS by the DJGPP project and to Novell NetWare.

AWK is a general purpose programming language that is designed for processing text-based data, either in files or data streams, and was created at Bell Labs in the 1970s. The name AWK is derived from the family names of its authors — Alfred Aho, Peter Weinberger, and Brian Kernighan; however, it is not commonly pronounced as a string of separate letters but rather to sound the same as the name of the bird, auk. awk, when written in all lowercase letters, refers to the UNIX or Plan 9 program that runs other programs written in the AWK programming language. AWK is an example of a programming language that extensively uses the string data type, associative arrays (that is, arrays indexed by key strings), and regular expressions. The power, terseness, and limitations of AWK programs and sed scripts inspired Larry Wall to write PERL. Because of their dense notation, all these languages are often used for writing one-liner programs. AWK is one of the early tools to appear in Version 7 UNIX and gained popularity as a way to add computational features to a UNIX pipeline. A version of the AWK language is a standard feature of nearly every modern UNIX-like operating system.


The Script

It is often necessary to compress files to save (or make) space. With gzip and bzip2 both available on most Linux distributions, one is left to ponder: which of the two is better for a given file? Although the common understanding is that bzip2 tends to do better as files get larger, the following script is meant to settle the question for a specific file. It compresses the given file with both tools, compares the resulting file sizes with that of the original (uncompressed) file, and then retains the smallest of the three (uncompressed, gzipped or bzipped).

#! /bin/bash
 
# BASH script that takes ONE filename as a (mandatory) argument
# and compresses it using both 'GZIP' and 'BZIP2'. Then, compares
# the file sizes and retains the version with the smallest size.
# 
# Usage: ./optimal_compression.sh FILENAME
#
# 01 September, 2006
# Tue, 14 Oct 2008 12:53:11 -0400
# Sun, 19 Oct 2008 12:08:36 -0400
 
if [[ $# -ne 1 ]];
then
  # Display error message when no (or more than one) files are specified as arguments
  echo
  echo " You must specify exactly one filename"
  echo
  # Exit with a non-zero status so that callers can detect the failure
  exit 1
else
 
  # Assign the supplied filename to a variable
  FILENAME=$1
 
  # Other useful variables
  TODAY=`date +"%Y%m%d_%H%M%S"`
  BZIP=`which bzip2`
  GZIP=`which gzip`
 
  # Temporary files that contain compressed data
  BZIP_FILE="/tmp/tmp_$TODAY.bz2"
  GZIP_FILE="/tmp/tmp_$TODAY.gz"
 
 
  # Compress the file with bzip2 and gzip
  # One may add '&' at the end of the lines below and
  # uncomment the 'wait' line, if really big files are being
  # compressed. It may save time. However, the definition
  # of OPTIMUM might need some modification if done so.
  # Variables are quoted so that filenames with spaces work.
  "$BZIP" < "$FILENAME" > "$BZIP_FILE"
  "$GZIP" < "$FILENAME" > "$GZIP_FILE"
  # wait
 
  # Compare file size - of normal, bzipped and gzipped files
  # and determine which one has the smallest (optimal) value.
  # Files are numbered in the order listed below (1 = original,
  # 2 = bzip2, 3 = gzip); on a tie, the lower number wins.
  OPTIMUM=`for F in "$FILENAME" "$BZIP_FILE" "$GZIP_FILE"; do wc -c < "$F"; done | \
           awk '{print $1":"NR}' | sort -t ':' -k1,1n -k2,2n | \
           awk -F ':' '{if ($1 != "") { print $2 }}' | head -1`
 
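  case "$OPTIMUM" in
    1 ) echo
        echo " $FILENAME not compressed."
        echo
        ;;
    2 ) echo
        echo " $FILENAME compressed with $BZIP => $FILENAME.bz2"
        echo
        # Replace the original with the bzipped copy
        mv "$BZIP_FILE" "$FILENAME.bz2" && rm -f "$FILENAME"
        ;;
    3 ) echo
        echo " $FILENAME compressed with $GZIP => $FILENAME.gz"
        echo
        # Replace the original with the gzipped copy
        mv "$GZIP_FILE" "$FILENAME.gz" && rm -f "$FILENAME"
        ;;
  esac
 
  # Remove whatever temporary files remain. The original is removed
  # above, and only once its compressed copy is in place.
  rm -f "$BZIP_FILE" "$GZIP_FILE"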
fi
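
The comments in the script mention adding '&' and 'wait' so that both compressions run at the same time. What follows is only a rough sketch of that idea, not the script above: the function name smallest_of_three and the use of mktemp are my own assumptions for illustration. It looks up each size by filename, so it should not matter which of the two background jobs finishes first.

```shell
#!/bin/bash
# Sketch only: smallest_of_three is a hypothetical helper, not part of
# the script above. It prints "original", "bzip2" or "gzip".

smallest_of_three() {
  local file="$1"
  local tmpdir
  tmpdir=$(mktemp -d) || return 1

  # Start both compressors in the background, then wait for both
  bzip2 -c "$file" > "$tmpdir/out.bz2" &
  gzip  -c "$file" > "$tmpdir/out.gz" &
  wait

  # Read each size (in bytes) by name; $(( )) strips any padding from wc
  local size_orig size_bz2 size_gz
  size_orig=$(( $(wc -c < "$file") ))
  size_bz2=$(( $(wc -c < "$tmpdir/out.bz2") ))
  size_gz=$(( $(wc -c < "$tmpdir/out.gz") ))

  # Prefer the original on a tie, then bzip2, then gzip
  local best="original" best_size=$size_orig
  if [ "$size_bz2" -lt "$best_size" ]; then best="bzip2"; best_size=$size_bz2; fi
  if [ "$size_gz" -lt "$best_size" ]; then best="gzip"; best_size=$size_gz; fi

  rm -rf "$tmpdir"
  echo "$best"
}
```

Calling smallest_of_three with a filename prints the name of the winner and leaves the original file untouched; deciding what to keep or delete is left to the caller.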



Optimal Compression


Responses to BASH – GZIP or BZIP2?

  1. bd_ says:

    A jpeg file is probably a bad example – jpeg images are already compressed, and so you’re really testing how well they do on /already compressed/ data – which is going to be up to random chance, really.

  2. Gowtham says:

    @bd_:

    Thanks for the info about JPEG images. I did notice that – one JPEG gave me BZIP2 while this one gave me GZIP; but had no idea why it was so :(

    For screenshot purposes, I couldn’t easily find a file on my computer that would get compressed via GZIP.



