cross
The Algorithms item   Sep 16 23:45 UTC 2006

This is the algorithms item.  An algorithm is essentially a technique for
doing something; for instance, sorting an array of integers or a list of items.
Algorithms are fundamental to computing, and form a basic part of any
programmer's education.  They are also often independent of any particular
language or environment.  Discuss them here.
13 responses total.
herasleftnut
response 1 of 13:   Oct 7 17:43 UTC 2006

Okay, I'm not drawing a clear distinction between the worst case scenarios
for Binary Search Trees and Quicksort. I know both of these algorithms
can be represented as a tree. I think the correct mathematical term is an
acyclic graph. I would look this up, but the CR math book is holding up my
bedroom window.

Anyhow, I can see the Binary Search Tree turning into a linked list in the
worst case scenario. In that case it would be linear time (vs. logarithmic).
However, for the Quicksort algorithm, I don't see how it can be quadratic time
in the worst case scenario if the tree also degenerates into a linked list. I
know this can be calculated using mathematical induction, but I totally forgot
how to do that kind of shit.

herasleftnut
response 2 of 13:   Oct 7 17:45 UTC 2006

er *CRC Standard Mathematical Tables and Formulae*
cross
response 3 of 13:   Oct 7 18:19 UTC 2006

Regarding #1: Quicksort doesn't turn anything into a tree, though some
textbooks use a graphic that looks like a tree to represent how the
algorithm recursively splits up the array as it operates (they often do this
when describing the algorithm's space complexity).  This may be confusing
you....

With a basic tree, yes, the worst case for insertions is when the data is
pre-sorted, in which case everything goes in on one side or the other of the
tree (depending on how it's sorted), the tree turns into a glorified linked
list, and access times are O(n) (i.e., bounded by some constant times
n).
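
To make that concrete, here's a minimal sketch of a naive, unbalanced BST
insert in C (hypothetical names, my illustration rather than anything from
a real library); feed it 1, 2, 3, 4, 5 in order and every insertion takes
the right branch, so the tree is really a chain:

    #include <stdlib.h>

    /* A naive, unbalanced binary search tree node. */
    struct node {
        int value;
        struct node *left, *right;
    };

    /* Insert with no rebalancing.  Pre-sorted input (1, 2, 3, ...) always
     * takes the right branch, so the "tree" degenerates into a chain of
     * right children -- effectively a linked list with O(n) lookups. */
    struct node *insert(struct node *root, int value)
    {
        if (root == NULL) {
            struct node *n = malloc(sizeof *n);
            n->value = value;
            n->left = n->right = NULL;
            return n;
        }
        if (value < root->value)
            root->left = insert(root->left, value);
        else
            root->right = insert(root->right, value);
        return root;
    }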

Quicksort just operates on some linear data structure, sorting the elements
in place.  It does this recursively.  Basically, in each invocation, the
algorithm picks the middle element of the array, calls it the "pivot", and
walks over the array.  If, during its walk, it finds an element on the left
side of the pivot that's greater than or equal to the pivot, it tries to
find an element on the right side of the pivot that has a value less than
that of the pivot and swaps the two.  Note that the pivot might move around
in the array during this process.  But, by the time you get to the pivot
from both sides, you're guaranteed that everything on the left side of the
pivot is strictly less than the pivot value, and everything on the right
side is greater than or equal to the pivot value.  Then you call quicksort
twice on subarrays: once to the left of the pivot value, and once to the
right.
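
Here's a rough C sketch of that scheme (a symmetric, Hoare-style variant,
so elements equal to the pivot can end up on either side; hypothetical
names, not anyone's library code):

    /* Partition around the middle element, then recurse on both halves. */
    void quicksort(int *a, int lo, int hi)
    {
        if (lo >= hi)
            return;

        int pivot = a[lo + (hi - lo) / 2];  /* the middle element */
        int i = lo, j = hi;

        while (i <= j) {
            while (a[i] < pivot) i++;  /* find a left element >= pivot */
            while (a[j] > pivot) j--;  /* find a right element <= pivot */
            if (i <= j) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;  /* swap them */
                i++; j--;
            }
        }
        /* Now a[lo..j] <= pivot and a[i..hi] >= pivot. */
        quicksort(a, lo, j);
        quicksort(a, i, hi);
    }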

Now, this clearly splits the array in half each time, right?  So one would
think that it would take on the order of O(n lg n) time (where lg is
logarithm base 2), since I'm recursing O(lg n) times and each time I'm
walking over O(n) elements of the array, for a total time complexity of
O(n)*O(lg n) = O(n lg n).  But what happens if I pick a bad pivot?  Say
nothing is ever greater than the pivot, so *every other element* lies to the
left of the pivot value.  Then the pivot moves over to the far right hand
side of the array, and when I recurse, I walk over n - 1 elements on the
left.  Suppose that, every time I recurse, I wind up with a similarly bad
pivot with n - k elements on the left (for the k'th recursion).  Then it's
not hard to see that I end up with O(n) recursions instead of O(lg n), and
on each, I'm dealing with O(n) elements, for a total time complexity of
O(n)*O(n) = O(n^2).
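
To spell out the math #1 was reaching for: in the bad case each call does
about c*n work and then recurses on n - 1 elements, so the running time obeys

    T(n) = T(n - 1) + c*n
         = c*(n + (n - 1) + ... + 1)
         = c*n*(n + 1)/2
         = O(n^2)

whereas an even split gives T(n) = 2*T(n/2) + c*n = O(n lg n).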

So, for a poor choice of pivot, I get pathological behavior that's bounded by
a term that is quadratic in the size of the input.
herasleftnut
response 4 of 13:   Oct 7 21:52 UTC 2006

Okay, cool. That clears up a lot of shit. 
cross
response 5 of 13:   Oct 7 22:00 UTC 2006

Yeah.  The secret to understanding the worst case scenario for quicksort
running time is the choice of pivot: you don't *have* to pick the middle
element, but if the pivot is chosen poorly, the time will be bad.
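
One cheap, common heuristic (my example; nobody above named it) is
median-of-three: take the median of the first, middle, and last elements,
which in particular defeats already-sorted input:

    /* Return the median of the first, middle, and last elements; using
     * it as the pivot makes sorted and reverse-sorted input well behaved. */
    int median_of_three(const int *a, int lo, int hi)
    {
        int x = a[lo], y = a[lo + (hi - lo) / 2], z = a[hi];

        if ((x <= y && y <= z) || (z <= y && y <= x)) return y;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return x;
        return z;
    }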
sholmes
response 6 of 13:   Oct 8 00:00 UTC 2006

There are techniques for choosing pivots which minimize (though not entirely
negate) such worst cases. Moreover, the qsort() that is supplied with glibc
is not a pure quicksort: when a subarray gets smaller than a particular
value after a few recursions, it invokes insertion sort on it.
cross
response 7 of 13:   Oct 8 03:17 UTC 2006

Yes, there are many techniques for choosing a pivot to minimize the chances
of pathological behavior.  McIlroy and Bentley wrote an SP&E paper on it a few
years back; their sort is similar if not identical to what comes with BSD
and maybe some other operating systems.  Let me see if I can find a link to
that paper....  Here's one, though it's not where I'd expected it:

http://www.enseignement.polytechnique.fr/profs/informatique/Luc.Maranget/421/09/bentley93engineering.pdf

The optimization where you stop quicksort once you have a sufficiently small
subarray and instead employ something else (e.g., insertion sort) is based
on the observation that while quicksort nominally has O(n lg n) behavior, in
practice there are fairly large constant factors that bound the run time.
Insertion sort, which runs on the order of O(n^2), has smaller constant
factors and thus can be much faster on small arrays than quicksort's
repeated recursions.  Further, for arrays that are `close' to sorted,
insertion sort runs in near linear time.  In general, there's some cutoff
point at which it's actually cheaper to use the nominally slower algorithm.
I believe the idea is due to Sedgewick, but it's there to optimize the
constant portion of the runtime of the algorithm (the stuff you don't see
inside the big O), not so much to minimize the effects of choosing a bad
pivot....
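
Here's roughly what the cutoff looks like in code (my sketch; the value 16
is just a plausible ballpark, not what glibc or BSD actually use):

    #define CUTOFF 16  /* real libraries tune this value empirically */

    /* Quadratic, but with tiny constant factors, and near linear on
     * nearly-sorted input. */
    void insertion_sort(int *a, int lo, int hi)
    {
        for (int i = lo + 1; i <= hi; i++) {
            int v = a[i], j = i - 1;
            while (j >= lo && a[j] > v) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
    }

    /* Hybrid quicksort: partition and recurse while the subarray is big,
     * hand the small pieces to insertion sort. */
    void hybrid_quicksort(int *a, int lo, int hi)
    {
        if (hi - lo + 1 <= CUTOFF) {
            insertion_sort(a, lo, hi);
            return;
        }
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++; j--;
            }
        }
        hybrid_quicksort(a, lo, j);
        hybrid_quicksort(a, i, hi);
    }

(Sedgewick's variant, if I recall correctly, just stops recursing below the
cutoff and finishes with one insertion sort pass over the whole
nearly-sorted array.)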
mcnally
response 8 of 13:   Oct 8 03:34 UTC 2006

I'm a heapsort fan, mostly because the whole heap concept is so cool
once you finally get it.  Also, it doesn't have the O(n^2) worst-case
performance problem that quicksort has.
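
For anyone the heap concept hasn't clicked for yet: the neat part is that
the tree is implicit in the array indices, with the children of node i at
2i+1 and 2i+2, so there are no pointers at all.  A minimal sketch (my
illustration, hypothetical names):

    /* Restore the max-heap property for the subtree rooted at i,
     * treating a[0..n-1] as an implicit binary tree. */
    void sift_down(int *a, int i, int n)
    {
        for (;;) {
            int largest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < n && a[l] > a[largest]) largest = l;
            if (r < n && a[r] > a[largest]) largest = r;
            if (largest == i) return;
            int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
            i = largest;
        }
    }

    void heap_sort(int *a, int n)
    {
        for (int i = n / 2 - 1; i >= 0; i--)  /* build the heap: O(n) */
            sift_down(a, i, n);
        for (int i = n - 1; i > 0; i--) {     /* extract max n-1 times */
            int tmp = a[0]; a[0] = a[i]; a[i] = tmp;
            sift_down(a, 0, i);               /* O(lg n) per extraction */
        }
    }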
cross
response 9 of 13:   Oct 8 03:54 UTC 2006

That's true, though in the non-pathological case, heapsort tends to be a bit
slower than quicksort.  Still, it's good if you need bounded O(n lg n) running
time.  Mergesort is also a good one; I remember sitting in a CS class once
where the professor used the standard ``you have a lot of data on tape that
won't fit into core...'' example as the rationale for mergesort.  The guy
sitting next to me (whom I knew socially) looked at me and said with a sort
of grimace, ``tape?!''  To which I replied, ``cache.''
sholmes
response 10 of 13:   Oct 8 12:36 UTC 2006

Moreover, data in business are usually somewhat sorted; e.g., cheques are
usually encashed in serial number order, more or less.
cross
response 11 of 13:   Oct 8 16:36 UTC 2006

Yeah.  Often, as a design choice, I've found it best not to worry about
sorting data, but rather to store it in some sort of pre-sorted container
(like a balanced tree structure or something).  Then, the cost of sorting
the data is amortized over the total execution time of the
program.
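
In C, POSIX's tsearch() family gives you that sort of container (glibc's
implementation is, as far as I know, a red-black tree, so it stays balanced
even on sorted input).  A minimal sketch:

    #include <stdio.h>
    #include <search.h>  /* POSIX tsearch/twalk */

    static int cmp(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* twalk calls this at every node; postorder/leaf visits come out
     * in sorted (in-order) sequence. */
    static void print_node(const void *nodep, VISIT which, int depth)
    {
        (void)depth;
        if (which == postorder || which == leaf)
            printf("%d\n", **(const int *const *)nodep);
    }

    int main(void)
    {
        static int data[] = { 42, 7, 19, 3, 88 };
        void *root = NULL;

        /* Insert as the data arrives; the tree keeps itself ordered. */
        for (int i = 0; i < 5; i++)
            tsearch(&data[i], &root, cmp);

        twalk(root, print_node);  /* prints 3 7 19 42 88 */
        return 0;
    }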
easlern
response 12 of 13:   Dec 8 15:34 UTC 2006

Does anybody know where I can look to learn about different algorithms for
integration? It'd be great if there were some info on complexity too. I found
some info on the Risch algorithm, which is used by Maple, but no mention of
its complexity or method. The usual haunts (Wikipedia, Google, Wolfram) have
failed me.  :(
cross
response 13 of 13:   Dec 8 16:23 UTC 2006

The Numerical Recipes book has a lot of information on such things.
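
If it turns out to be numerical rather than symbolic (Risch-style)
integration you need, the basic quadrature methods are simple enough to
sketch; e.g., composite Simpson's rule (my example, not from the book):

    #include <math.h>
    #include <stdio.h>

    /* Composite Simpson's rule on [a, b] with n subintervals (n even).
     * For smooth integrands the error falls off as O(h^4). */
    double simpson(double (*f)(double), double a, double b, int n)
    {
        double h = (b - a) / n;
        double sum = f(a) + f(b);
        for (int i = 1; i < n; i++)
            sum += f(a + i * h) * (i % 2 ? 4.0 : 2.0);
        return sum * h / 3.0;
    }

    int main(void)
    {
        /* The integral of sin over [0, pi] is exactly 2. */
        printf("%.10f\n", simpson(sin, 0.0, M_PI, 100));
        return 0;
    }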