Softpanorama

May the source be with you, but remember the KISS principle ;-)
Home Switchboard Unix Administration Red Hat TCP/IP Networks Neoliberalism Toxic Managers
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Shellsort

Old News

See Also Recommended Books Recommended Links Donald Knuth
Animations Test suits History Humor Etc

Shellsort is the oldest fast sorting algorithms. It was proposed by Shell (1959)  and still in many cases holds its own against other competitors due to its simplicity and the ability to use partially ordered sequences. Knuth found that the shall sort efficiency is close to N(log N)2 and N1.25. A more complicated function of the form  is involved for some sequences.

But please note that shell sort usually works very well on real data, better then on random sequences. What is the most important that it has worst case not that different from best case. For this reason it is often preferred to quicksort which is a fragile algorithms that can perform quadratic on some practically important cases. The other alternatives with known worst case efficiency is mergesort and heapsort. Both sort a file of N elements in time proportional to N log N, no matter what the input.

We can consider shell sort to be an elegant extension of insertion sort that gains speed by allowing exchanges of elements that are far apart. It sorts slices with a particular step h. Such a file is said to be h-sorted. If we first h-sort a file using a large value of h, elements move long distances and the efficiency of h-sorting for smaller values of  h improves.  When you get to the h=1 you implement a regular insertion sort and thus get a sorted file.

But the details of this elegant idea are non-trivial. First of all the result of h-sorting a file that is already k-sorted is a file became both h-and k-sorted. While any sequence of values of h will work as long as ends with h1 = 1, some sequences  are clearly better than others.

Knuth, D. E. The Art of Computer Programming, Vol. 3: Sorting and Searching, 2nd ed. Reading, MA: Addison-Wesley, pp. 83-95, 1998.

Shell, D. L. "A High-Speed Sorting Procedure.' 'Comm. ACM 2, No. 7, 30-32, Jul. 1959.

NEWS CONTENTS

Old News

[Oct 07, 2010] Donald Knuth, P157 Shellsort with three increments.

TeX file swti.tex (15822 bytes)

Recommended Links

Google matched content

Softpanorama Recommended

Top articles

Sites

9.4 Shellsort


Shellsort

Shellsort is one of the oldest sorting algorithms, named after its inventor D.L. Shell (1959) [Sh]. It is fast, easy to understand and easy to implement. However, its complexity analysis is a little more sophisticated. Idea The idea of Shellsort is the following: arrange the data sequence in a two-dimensional array sort the columns of the array The effect is that the data sequence is partially sorted.

The process above is repeated, but each time with a narrower array, i.e. with a smaller number of columns. In the last step, the array consists of only one column.

In each step, the sortedness of the sequence is increased, until in the last step it is completely sorted. However, the number of sorting operations necessary in each step is limited, due to the presortedness of the sequence obtained in the preceding steps.

Example: Let 3 7 9 0 5 1 6 8 4 2 0 6 1 5 7 3 4 9 8 2 be the data sequence to be sorted. First, it is arranged in an array with 7 columns (left), then the columns are sorted (right):

3 7 9 0 5 1 6
8 4 2 0 6 1 5
7 3 4 9 8 2
3 3 2 0 5 1 5
7 4 4 0 6 1 6
8 7 9 9 8 2
Data elements 8 and 9 have now already come to the end of the sequence, but a small element (2) is also still there. In the next step, the sequence is arranged in 3 columns, which are again sorted:
3 3 2
0 5 1
5 7 4
4 0 6
1 6 8
7 9 9
8 2
0 0 1
1 2 2
3 3 4
4 5 6
5 6 8
7 7 9
8 9
Now the sequence is almost completely sorted. When arranging it in one column in the last step, it is only a 6, an 8 and a 9 that have to move a little bit to their correct position.

Implementation

Actually, the data sequence is not arranged in a two-dimensional array, but held in a one-dimensional array that is indexed appropriately. For instance, data elements at positions 0, 5, 10, 15 etc. would form the first column of an array with 5 columns.

The "columns" obtained by indexing in this way are sorted with Insertion Sort, since this method has a good performance with presorted sequences.

The following program sorts an array a from index position 0 through n-1. The number of columns used for arranging data in each step is in array cols. Thus, data are arranged in 1391376 columns in the first step and in one column in the last step. (Note that essentially nothing is done if the number of columns h is larger than the number of data elements n.) Each column is sorted by Insertion Sort. First, data of the second row (beginning at i = h) are sorted to the correct position in their column, then data of the third row (when i reaches value 2h) etc.

Algorithm Shellsort

void shellsort (int[] a, int n)
{
    int i, j, k, h, v;
    int[] cols = {1391376, 463792, 198768, 86961, 33936, 13776, 4592,
                    1968, 861, 336, 112, 48, 21, 7, 3, 1}
    for (k=0; k<16; k++)
    {
        h=cols[k];
        for (i=h; i<n; i++)
        {
            v=a[i];
            j=i;
            while (j>=h && a[j-h]>v)
            {
                a[j]=a[j-h];
                j=j-h;
            }
            a[j]=v;
        }
    }
}

Shellsort Applet

This applet illustrates Shellsort with various increment sequences. Select an increment sequence (they are arranged roughly from best to worst). Then select either random or reverse input. (Worst-case input is planned for the future). Then enter N.

You can run the program to see how each phase rearranges the data. Each click of the button executes the next phase and generates a picture made popular in several animation systems (and also Bob Sedgewick's books). A diagonal line is sorted input. You need a sufficiently small N (at most 325 in a typical window) and you need to check the stop after each pass box. A running count of comparisons and data moves is kept. You can also let the program run and get these counts by not checking the box. But then you don't get the picture.

Future plans will include additional and also arbitrary increment sequences, worst-case data, and perhaps an animation (instead of step by step clicking) if I ever figure out this language.

Shellsort This page contains some run time data. It' unclear how the test sets were generated. If they are random they are perfect for quicksrt and this is not a fair exaqmple

Shellsort
PROBLEM

Given an array A. of N elements the result of sorting the array in place is to arrange elements of A. so that

A.1<=A.2<=...<=A.N


ALGORITHM

We first sort all elements that are at distance H.1 from each other, then re-sort the array using increment H.2, finally we do an ordinary insertion sort with increment 1 (H.1>H.2>...>1). D. L. Shell, the originator of the algorithm, took H=N%2 as the first value of H and used the formula H=H%2. A sequence which has been shown empirically to do well is

H=...,1093,364,121,40,13,4,1

i.e H=(3**K-1)/2 which can be generated by the formula H=3*H+1 with the initial value H=1. The number of operations required in all is of order O(N**(3/2)) for the worst possible ordering of the original data. In average the operations count goes approximately as O(N**1.25).


PRACTICE

The following table compares four sorting algorithms. The A. array includes integers in the range 1 to Max=10,100,1000,40000.

Running time in seconds for N=10000
Algorithm Max = 100 Max = 1000 Max = 40000 Max = 99999
QUICKSORT 4.66 4.53 4.89 4.59
HEAPSORT 10.26 10.67 11.58 11.18
SHELLSORT 7.45 8.89 10.58 10.13
COUNTING_SORT 1.88 1.95 7.93 31.85

IMPLEMENTATION
Unit: nonrecursive internal subroutine

Global variables: array A. of arbitrary elements

Parameter: a positive integer N - number of elements in A.

Result: Reordering of input array such that

A.1<=A.2<=...<=A.N


SHELLSORT: procedure expose A.
parse arg N
H = 1
do until H > N; H = 3 * H + 1; end
do until H = 1
H = H % 3
do I = H + 1 to N
V = A.I; J = I; JmH = J - H
do while A.JmH > V
A.J = A.JmH; J = J - H
if J <= H then leave
JmH = J - H
end
A.J = V
end
end
return

CONNECTIONS

Sorting Problem
Counting sort
Heapsort
Quicksort
Merging


Literatura

Rich R. P. Internal Sorting Methods Illustrated with PL/1 Programs Prentice Hall, Inc., Engelwood Cliffs, 1972

Sedgewick R. Algorithms Addison-Wesley, Reading, Massachusetts, 1984


Acknowledgement

I changed the test from Max=10 ... 40000 to Max=100 ... 99999. Thanks for idea go to Walter Pachl.


Note

This test ran in the Windows 2000 Professional environment on the computer with 132MB RAM and processor x86Family 6 Mode 6 Stepping 5.

Analysis of Shellsort and Related Algorithms by Robert Sedgewick Presented at Fourth Annual European Symposium on Algorithms. Barcelona, September, 1996

This paper surveys the theoretical and empirical studies that have been done over the past four decades on the Shellsort algorithm and its variants. The discussion includes: upper bounds, including linkages to number-theoretic properties of the algorithm; lower bounds on Shellsort and Shellsort-based networks; average-case results; proposed probabilistic sorting networks based on the algorithm; and a list of open problems.

Comments, questions, suggestions:

   mail [email protected] 
Improving Shellsort Through Evolution

Shellsort is an in-place sorting algorithm. It is described in many data structures books, including [] and []. The implementation I used is shell.lisp. Notice that the comparison function (lessp) & the copy function (copyfn) are parameters so that other functions in the program may use this implementation of Shellsort to track the cost of a sequence of increments.

In a sense, Shellsort is a parameterized family of sorting algorithms. The parameter is a sequence of integral increments. Each increment is used for one pass through the list being sorted. The increments begin with their largest & decrease in size each time through the algorithm. The final increment must be 1.

The efficiency of Shellsort depends on the sequence of increments. The efficiency has proven difficult to analyze. The most efficient sequence of increments known was proposed by Sedgewick.

Shellsort (implementation in several languages)



Etc

Society

Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers :   Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism  : The Iron Law of Oligarchy : Libertarian Philosophy

Quotes

War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda  : SE quotes : Language Design and Programming Quotes : Random IT-related quotesSomerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose BierceBernard Shaw : Mark Twain Quotes

Bulletin:

Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 :  Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method  : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law

History:

Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds  : Larry Wall  : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOSProgramming Languages History : PL/1 : Simula 67 : C : History of GCC developmentScripting Languages : Perl history   : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-MonthHow to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand ~Archibald Putt. Ph.D


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...

You can use PayPal to to buy a cup of coffee for authors of this site

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: March 12, 2019