Data Manipulation in Python: Examples

Most applications involve some form of data manipulation, whether it's simply adding a few numbers together or extracting the individual fields from a log file and creating a report. In either case, you need to convert information from one form or value into another.

Data manipulation in Python relies on a combination of the built-in operators and additional modules that provide extra functionality. For example, the math module provides trigonometric functions, so you can calculate the length of one side of a triangle given the lengths of the other two sides and the angle between them.
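The triangle calculation mentioned above can be sketched with the law of cosines. This is a minimal illustration, not code from the chapter: the function name third_side and the sample values are assumptions made for the example.

```python
import math

def third_side(a, b, angle_degrees):
    """Return the length of the side opposite the given angle.

    a and b are the two known sides; angle_degrees is the angle
    between them. Uses the law of cosines.
    """
    angle = math.radians(angle_degrees)  # math functions expect radians
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(angle))

# A right angle between sides 3 and 4 gives the classic 3-4-5 triangle.
print(third_side(3, 4, 90))
```

Because the result is a float, compare it with a small tolerance rather than with ==.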

For manipulating text there are a number of solutions available, including the built-in operators and string functions. For more advanced operations you can use regular expressions to match elements with a remarkable degree of control.
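As a small taste of both approaches, the sketch below pulls fields out of a log line first with a built-in string method and then with a regular expression. The log line and its layout are invented for illustration.

```python
import re

# Hypothetical log line, invented for this example.
line = "2001-03-08 10:39:52 GET /index.html 200"

# Built-in string method: split the line on whitespace.
fields = line.split()
print(fields[2], fields[3])

# Regular expression: named groups pick out the date and the status code.
pattern = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2}) .* (?P<status>\d{3})$")
match = pattern.match(line)
print(match.group("date"), match.group("status"))
```

split() is enough when the fields are cleanly separated; the regular expression earns its keep when you need to validate the shape of each field as well.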

This chapter looks at both aspects of data manipulation and explains how to handle special data types, including binary structures and Python’s Unicode system.

Manipulating Numbers

Most numeric operations can be performed using the range of built-in operators that work directly with number objects. All of the basic math operations such as addition, subtraction, raising numbers to the power of another value, and so on, are supported natively by the Python operators.
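A quick sketch of the built-in operators at work; the values are arbitrary.

```python
total = 7 + 3          # addition
difference = 7 - 3     # subtraction
product = 7 * 3        # multiplication
power = 7 ** 3         # raising 7 to the power of 3

# divmod() returns the whole-number quotient and the remainder together.
quotient, remainder = divmod(7, 3)

print(total, difference, product, power, quotient, remainder)
```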

For other mathematical and trigonometric functions you need the math or cmath (for complex numbers) modules. The random and whrandom modules provide different methods for generating random numbers. All of these modules are discussed in this section.


The math module provides the basic mathematical and trigonometric functions. Note that all the trigonometric functions work in radians, not degrees. You can convert degrees into radians by using the formula d*((2*pi)/360), where d is the value in degrees. To convert the value back to degrees, use the formula r*(360/(2*pi)), where r is the value in radians.
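The two conversion formulas can be written directly as functions. The names to_radians and to_degrees are chosen for this sketch; current Python releases also provide math.radians() and math.degrees(), which do the same job.

```python
import math

def to_radians(d):
    """Convert degrees to radians: d * ((2 * pi) / 360)."""
    return d * ((2 * math.pi) / 360)

def to_degrees(r):
    """Convert radians to degrees: r * (360 / (2 * pi))."""
    return r * (360 / (2 * math.pi))

print(to_radians(180))           # half a turn, i.e. pi radians
print(to_degrees(math.pi / 2))   # a quarter turn in degrees
```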

The math module provides two constants: pi, the ratio of the circumference of a circle to its diameter, and e, the base of the natural logarithm.

Table 10-1 lists the functions supported by the math module. The ceil() and floor() functions deserve further explanation. Both return a whole number, represented as a float, which is either the integer component of the supplied expression or the next whole number. (Under Python 3 both functions return an int instead.) This is similar in principle to rounding up or down, except that the direction is fixed: ceil() always rounds up to the nearest whole number.

Function       Description
acos(x)        Returns the arccosine of x.
asin(x)        Returns the arcsine of x.
atan(x)        Returns the arctangent of x.
atan2(y, x)    Returns atan(y/x), using the signs of both arguments to select the correct quadrant.
ceil(x)        Returns the ceiling of x (the smallest whole number not less than x).
cos(x)         Returns the cosine of x.
cosh(x)        Returns the hyperbolic cosine of x.
exp(x)         Returns e**x.
fabs(x)        Returns the absolute value of x.
floor(x)       Returns the floor of x (the largest whole number not greater than x).
fmod(x, y)     Returns the remainder of x/y. This is like x % y, except that the result takes the sign of x.
frexp(x)       Returns a tuple containing the mantissa and exponent of x. The mantissa has the same sign as x.
hypot(x, y)    Returns the result of Pythagoras' theorem, i.e., sqrt(x**2+y**2).
ldexp(x, i)    Returns x*(2**i).
log(x)         Returns the natural logarithm of x.
log10(x)       Returns the base-10 logarithm of x.
modf(x)        Returns a tuple containing the fractional and integer parts of x. Both values have the same sign as x.
pow(x, y)      Returns x**y.
sin(x)         Returns the sine of x.
sinh(x)        Returns the hyperbolic sine of x.
sqrt(x)        Returns the square root of x.
tan(x)         Returns the tangent of x.
tanh(x)        Returns the hyperbolic tangent of x.
Table 10-1. Functions Supported by the math Module

floor() always rounds down to the nearest whole number. You can see the results of ceil() and floor() when using the functions interactively:

>>> from math import *
>>> ceil(9)
9.0
>>> ceil(8.1)
9.0
>>> floor(9.9)
9.0
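The direction of rounding matters most for negative values. The sketch below shows ceil() and floor() on a negative number, alongside modf(), fmod(), frexp() and ldexp() from Table 10-1. It assumes Python 3, where ceil() and floor() return int values rather than floats; the sample numbers are arbitrary.

```python
import math

# ceil() rounds toward positive infinity, floor() toward negative infinity.
print(math.ceil(-8.1), math.floor(-8.1))

# modf() splits a number into fractional and integer parts,
# both carrying the sign of the original value.
fractional, whole = math.modf(-3.25)
print(fractional, whole)

# fmod() keeps the sign of x, whereas % follows the sign of the divisor.
print(math.fmod(-7, 3), -7 % 3)

# frexp() and ldexp() are inverses of each other: x == mantissa * 2**exponent.
mantissa, exponent = math.frexp(-8.0)
print(mantissa, exponent, math.ldexp(mantissa, exponent))
```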




The cmath module mirrors the math module, except that it operates on complex numbers. The cmath module also provides the pi and e constants. Table 10-2 lists the functions supported by the cmath module.
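The difference between the two modules shows up as soon as a result leaves the real number line. A minimal sketch, with values chosen for illustration:

```python
import cmath
import math

# math.sqrt() rejects negative input...
try:
    math.sqrt(-1)
    math_error = None
except ValueError as exc:
    math_error = exc

# ...whereas cmath.sqrt() returns a complex result.
root = cmath.sqrt(-1)
print(root)

# Euler's identity: e**(i*pi) is -1, up to floating-point rounding.
euler = cmath.exp(1j * cmath.pi)
print(abs(euler + 1) < 1e-12)
```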

Random Numbers

Calculating a random number can be useful in many areas of programming. Introducing a random element is a great way to add a unique or impossible-to-guess value to an ID or reference number and can also be used in applications such as games where you need an unexpected or unpredictable result.

True random number production is difficult, if not impossible, for a computer, purely because computers are designed to work with precise and predictable numbers. Humans are, of course, much better at producing random numbers, but you can't always have a handy human to generate your random numbers!

Most random number generators are officially listed as pseudo-random number generators, and most rely on imperfections or minor differences in calculations to produce a random result. None of these should be relied on solely to produce a random number for a temporary ID. See the sidebar "Creating Random IDs" for methods of creating unique IDs using random and static elements.

Function    Description
acos(x)     Returns the arccosine of x.
acosh(x)    Returns the arc hyperbolic cosine of x.
asin(x)     Returns the arcsine of x.
asinh(x)    Returns the arc hyperbolic sine of x.
atan(x)     Returns the arctangent of x.
atanh(x)    Returns the arc hyperbolic tangent of x.
cos(x)      Returns the cosine of x.
cosh(x)     Returns the hyperbolic cosine of x.
exp(x)      Returns e**x.
log(x)      Returns the natural logarithm of x.
log10(x)    Returns the base-10 logarithm of x.
sin(x)      Returns the sine of x.
sinh(x)     Returns the hyperbolic sine of x.
sqrt(x)     Returns the square root of x.
tan(x)      Returns the tangent of x.
tanh(x)     Returns the hyperbolic tangent of x.
Table 10-2. Functions Supported by the cmath Module

The standard Python distribution includes two modules for random numbers: random and whrandom. The random module provides a number of different functions for calculating random numbers; in addition, it exports the functions in whrandom. (The whrandom module was deprecated in Python 2.1 and removed in Python 2.5; in later releases, use the random module instead.) The whrandom module provides a more familiar interface for creating random numbers (using the Wichmann-Hill algorithm) with just four basic functions: randint(), random(), seed(), and choice(). The seed() function seeds the random number generator and accepts three arguments.

Creating Random IDs
If you want to create a session ID that must be unique, you cannot use random numbers alone. Although the numbers generated are random in the basic sense of the word, the reality is that, because of their nature, a value will eventually repeat.
You can get around this by using a combination of the current time and a random number. You can calculate the current time to the nearest second (actually, you can calculate it more precisely, but accuracy to the second is supported on all platforms). So if the session is created fresh each time, it will require an exceedingly large number of requests at exactly the same instant in order for the entire ID string to be duplicated. Although there are many methods for this, the one I've used for years is as follows:
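def make_session_id():
    # random exports the same randint() that whrandom provides
    from random import randint
    from time import gmtime
    # Three independent random numbers, each between 0 and 9999 inclusive
    (a, b, c) = (randint(0, 9999),
                 randint(0, 9999),
                 randint(0, 9999))
    # Current UTC time, to the nearest second
    (year, month, day, hour, minute, second) = gmtime()[0:6]
    # Interleave the time fields with the random numbers
    session = "%02d%04d%02d-%02d%02d%04d-%d%d%d" % \
              (second, a, hour, month, minute, b, c, day, year)
    return session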

The result of make_session_id(), when called, is a string that looks something like 52854910-08398569-891732001, which, while not guaranteed to be completely random and unique, is probably close enough given current CPU limits. I've tested the result produced using the same physical time for 10,000,000 combinations and never found a duplicate, so it appears to be reasonably reliable.
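One rough way to reason about how often two IDs created within the same second could coincide is the birthday approximation. The sketch below assumes the three randint(0, 9999) values are independent and uniformly distributed, so that only the 10000**3 random combinations distinguish IDs sharing a timestamp. The helper expected_duplicate_pairs is hypothetical and not part of the sidebar's code.

```python
def expected_duplicate_pairs(n_ids, combinations=10000 ** 3):
    """Approximate number of colliding pairs among n_ids random draws.

    Uses the birthday approximation n * (n - 1) / (2 * N), where N is
    the number of equally likely combinations.
    """
    return n_ids * (n_ids - 1) / (2.0 * combinations)

# Estimate for different numbers of IDs generated within one second.
for n in (1000, 100000, 10000000):
    print(n, expected_duplicate_pairs(n))
```

Under those assumptions the estimate is tiny for a few thousand IDs per second and grows with the square of the request rate, so the scheme suits moderate traffic better than bulk generation.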

If you don't supply any values to seed(), or if the three arguments all have the same value, the current time is used as the seed value. If you haven't called seed() yourself, it is called automatically the first time you use one of the other three functions. The random() function returns a random number in the range 0.0 to 1.0:

>>> import whrandom
>>> whrandom.random()
0.44718597724016607
>>> whrandom.random()
0.93284180701215091
>>> whrandom.random()
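Seeding is what makes a pseudo-random sequence repeatable. Because whrandom is not available in current Python releases, the sketch below uses the random module instead, which is an assumption of this example; note that random.seed() takes a single seed value rather than the three arguments described above. The seed 12345 is arbitrary.

```python
import random

random.seed(12345)   # arbitrary seed chosen for illustration
first = [random.random() for _ in range(3)]

random.seed(12345)   # re-seeding restarts the same sequence
second = [random.random() for _ in range(3)]

print(first == second)                       # True: same seed, same numbers
print(all(0.0 <= x < 1.0 for x in first))    # True: values lie in [0.0, 1.0)
```

This repeatability is handy for testing, and it is also the reason a seeded generator alone should not be trusted to produce unguessable IDs.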


The randint() function accepts two arguments that indicate the range of the integer number to be generated. For example, to produce a random number between 1 and 256 (inclusive):

>>> whrandom.randint(1,256)


To produce a random number between 100 and 1000:

>>> whrandom.randint(100, 1000)
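Both end points passed to randint() can be returned. The sketch below checks this empirically with the random module (used here as an assumption, since whrandom is absent from current Python releases) and contrasts it with random.randrange(), which excludes the upper bound.

```python
import random

# randint(1, 6) can return 1, 6 and everything in between.
inclusive = [random.randint(1, 6) for _ in range(10000)]
print(min(inclusive), max(inclusive))

# randrange(1, 6) follows range() conventions and never returns 6.
exclusive = [random.randrange(1, 6) for _ in range(10000)]
print(min(exclusive), max(exclusive))
```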


Finally, the choice() function randomly selects one of the elements from the supplied sequence. Consider the following example:

>>> whrandom.choice([1,3,5,7,11,13,17,19,23,29]) 
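A minimal sketch of the same call using the random module (an assumption of this example, since whrandom is absent from current Python releases), confirming that choice() only reads from the sequence:

```python
import random

values = [1, 3, 5, 7, 11, 13, 17, 19, 23, 29]
before = list(values)            # copy for comparison afterwards

picked = random.choice(values)   # select one element at random

print(picked in values)          # True: the result comes from the list
print(values == before)          # True: the list itself is unchanged

# choice() works on any non-empty sequence, including strings.
letter = random.choice("abc")
```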


Note that the item isn't removed from the sequence; it is only selected and returned.

The random module exports functions to generate random numbers using different distributions on real numbers. All of the functions are likely to produce numbers with a better random spread than the base randint() and random() functions from the whrandom module, but with a slight performance overhead. Table 10-3 lists the functions exported by the random module.

Function                     Description
betavariate(alpha, beta)     Returns a value between 0 and 1 from the Beta distribution. alpha and beta should be greater than -1.
cunifvariate(mean, arc)      Returns a value between (mean-arc/2) and (mean+arc/2) from the circular uniform distribution.
expovariate(lambd)           Returns a value between 0 and infinity from the exponential distribution. The parameter is spelled lambd because lambda is a reserved word in Python.
gamma(alpha, beta)           Returns a value from the Gamma distribution, where alpha should be greater than -1 and beta should be greater than 0.
gauss(mu, sigma)             Returns a value from the Gaussian distribution with mean mu and standard deviation sigma.
lognormvariate(mu, sigma)    Returns a value from the log-normal distribution. The natural logarithm of the value is normally distributed with mean mu and standard deviation sigma.
normalvariate(mu, sigma)     Returns a value from the normal distribution with mean mu and standard deviation sigma.
paretovariate(alpha)         Returns a value from the Pareto distribution with shape parameter alpha.
vonmisesvariate(mu, kappa)   Returns a value from the von Mises distribution, where mu is the mean angle in radians between 0 and 2*pi and kappa is a concentration factor greater than or equal to zero.
weibullvariate(alpha, beta)  Returns a value from the Weibull distribution with scale parameter alpha and shape parameter beta.
Table 10-3. Functions Exported by the random Module
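To get a feel for a few of these distributions, the sketch below draws a batch of samples from each and compares the sample mean with the theoretical value. The seed, the sample size and the parameter values are arbitrary choices for this illustration, and it uses only functions that are still present in current releases of the random module.

```python
import random

random.seed(2001)   # arbitrary seed so that repeated runs match
n = 10000

# Gaussian samples should average close to mu (10 here).
gauss_mean = sum(random.gauss(10, 2) for _ in range(n)) / n

# Exponential samples should average close to 1/lambd (2 here).
expo_mean = sum(random.expovariate(0.5) for _ in range(n)) / n

# Beta samples always fall between 0 and 1.
beta_values = [random.betavariate(2, 5) for _ in range(n)]

print(gauss_mean, expo_mean, min(beta_values), max(beta_values))
```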
