Friday, June 10, 2016
Tuesday, April 26, 2016
Finally started to use something fancy (Data.Aeson)
First time aeson user. The magic part is ToJSON/FromJSON instance. Knowing how generics works in Haskell, one could intuitively expect empty ToJSON and FromJSON instance to work as is (no need to write any specialization to serialize/de-serialize ADT) and that will be the discipline I would choose when define APIs.
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
import qualified Data.ByteString.Lazy as L
import qualified Data.Text as Text
import Data.Text(Text)
import Data.Aeson
import GHC.Generics
data Patient = Patient {
firstName :: Text
, lastName :: Text
, emailAddress :: Text
, birthDate :: Text
, latestBloodPressureDate :: Text
, systolic :: Int
, diastolic :: Int
} deriving (Generic, Show)
instance ToJSON Patient where
instance FromJSON Patient where
patients :: L.ByteString -> Maybe [Patient]
patients = decode
main = L.getContents >>= print . patients
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
import qualified Data.ByteString.Lazy as L
import qualified Data.Text as Text
import Data.Text(Text)
import Data.Aeson
import GHC.Generics
data Patient = Patient {
firstName :: Text
, lastName :: Text
, emailAddress :: Text
, birthDate :: Text
, latestBloodPressureDate :: Text
, systolic :: Int
, diastolic :: Int
} deriving (Generic, Show)
instance ToJSON Patient where
instance FromJSON Patient where
patients :: L.ByteString -> Maybe [Patient]
patients = decode
main = L.getContents >>= print . patients
Wednesday, April 13, 2016
generate all combinations of given length for [1..k]
> selections 2 4
[[1,1],[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[2,4],[3,1],[3,2],[3,3],[3,4],[4,1],[4,2],[4,3],[4,4]]
selections :: Int -> Int -> [[Int]]
selections 1 k = fmap pure [1..k]
selections n k = liftA2 (:) [1..k] (selections (n-1) k)
total number of selections = k ^ n
Tuesday, April 5, 2016
Generate BST for an ordered list
It seems hard when first heard of this. However, it's way more easier they I thought:
import Data.Array
import Data.List
data BTree a = Leaf | Branch a (BTree a) (BTree a) deriving Show
-- | the type signature can be omitted, use it here for documentation
makeBST :: Ord a => [a] -> BTree a
makeBST xs = acc (listArray (1, len) (sort xs)) 1 len
where len = length xs
acc u i j
| i > j = Leaf
| i <= j = Branch (u ! k) (acc u i (pred k)) (acc u (succ k) j)
where k = (i+j) `div` 2
The idea is extremely simple, for a (sub) list already sorted, divide it to
[left] ++ [pivot] ++ [right]
make pivot to be the root node, then call makeBST recursively on both @left and @right, and inserted it to the root node.
I've spotted C++ implementation which made me headache, however, the Haskell version feels like a one-liner.
Guess I've found another example of my theory:
To learn algorithm, you probably shouldn't read any code at all, not even pseudo code.
There are some outstanding books on algorithms, what I don't like is the imperative style pseudo code, and assume you have to implement the algorithm with C/C++/Java, which is really hard to understand. To me, if I tried to read into the code, I would forget the algorithm itself pretty quick, 3 month later, I forget the code altogether; Instead, tried to understand the algorithm without any code (or at least functional ones, draw graphs if necessary). It may take longer when implement some algorithms than memorizing the code, the upside is this kind of memory last really long.
import Data.Array
import Data.List
data BTree a = Leaf | Branch a (BTree a) (BTree a) deriving Show
-- | the type signature can be omitted, use it here for documentation
makeBST :: Ord a => [a] -> BTree a
makeBST xs = acc (listArray (1, len) (sort xs)) 1 len
where len = length xs
acc u i j
| i > j = Leaf
| i <= j = Branch (u ! k) (acc u i (pred k)) (acc u (succ k) j)
where k = (i+j) `div` 2
The idea is extremely simple, for a (sub) list already sorted, divide it to
[left] ++ [pivot] ++ [right]
make pivot to be the root node, then call makeBST recursively on both @left and @right, and inserted it to the root node.
I've spotted C++ implementation which made me headache, however, the Haskell version feels like a one-liner.
Guess I've found another example of my theory:
To learn algorithm, you probably shouldn't read any code at all, not even pseudo code.
There are some outstanding books on algorithms, what I don't like is the imperative style pseudo code, and assume you have to implement the algorithm with C/C++/Java, which is really hard to understand. To me, if I tried to read into the code, I would forget the algorithm itself pretty quick, 3 month later, I forget the code altogether; Instead, tried to understand the algorithm without any code (or at least functional ones, draw graphs if necessary). It may take longer when implement some algorithms than memorizing the code, the upside is this kind of memory last really long.
Sunday, March 20, 2016
match incremental input with predefined dictionary
Requirement:
1) a dictionary with many words (String) without duplicates
2) an input feed with many continuous characters (String)
3) input keep growing (append new character) if any word in the dictionary is a prefix (including equal) of the input stream, in this case the word need to be recorded, and continue 3); if none is a prefix of the input stream, then input terminate, return the result
Looks like a Trie can be used for such task, for every iteration when input grows, check if we can find input stream from the dictionary (converted to a Trie); however, it's not easy to implement efficient and pure functional Trie in Haskell. Thus here the alternative.
The idea is for every iteration, we keep reduce the candiates, if any candiates has a full match, we then could return the candiates; if there's no candidates, we then could let the caller terminate input stream.
candidate is defined as:
-- type Candidate = (String, [String])
The first part is the prefix already matched, second part is the substring of the candidates with prefix (first) removed;
result is aggregated by a Writer Monad;
return value could be either one of:
data MatchStatus = More -- we have candidates, keep growing input stream;
| Done -- No more candidate, terminate input
deriving Show
The monadic code makes the logic quite handy.
incr c = do
(matched, rems) <- br="" get=""> let rems' = fmap tail . filter (\xs -> if null xs then False else if (head xs) /= c then False else True) $ rems
matched' = c:matched
put (matched', rems')
when (any null rems') ( tell (Seq.singleton (reverse matched')) )
return $! if null rems' then Done else More-><- br="" get="">->
<- br="" get="">-><- br="" get="">
-- dict1 = ["xde", "xd", "efgh"]
-- runIdentity . runRWST (incr 'x' >> incr 'd' >> incr 'e' >> incr 'f') 0 $ ("", dict1)
-- (Done,("fedx",[]),fromList ["xd","xde"])->
1) a dictionary with many words (String) without duplicates
2) an input feed with many continuous characters (String)
3) input keep growing (append new character) if any word in the dictionary is a prefix (including equal) of the input stream, in this case the word need to be recorded, and continue 3); if none is a prefix of the input stream, then input terminate, return the result
Looks like a Trie can be used for such task, for every iteration when input grows, check if we can find input stream from the dictionary (converted to a Trie); however, it's not easy to implement efficient and pure functional Trie in Haskell. Thus here the alternative.
The idea is for every iteration, we keep reduce the candiates, if any candiates has a full match, we then could return the candiates; if there's no candidates, we then could let the caller terminate input stream.
candidate is defined as:
-- type Candidate = (String, [String])
The first part is the prefix already matched, second part is the substring of the candidates with prefix (first) removed;
result is aggregated by a Writer Monad;
return value could be either one of:
data MatchStatus = More -- we have candidates, keep growing input stream;
| Done -- No more candidate, terminate input
deriving Show
The monadic code makes the logic quite handy.
incr c = do
(matched, rems) <- br="" get=""> let rems' = fmap tail . filter (\xs -> if null xs then False else if (head xs) /= c then False else True) $ rems
matched' = c:matched
put (matched', rems')
when (any null rems') ( tell (Seq.singleton (reverse matched')) )
return $! if null rems' then Done else More-><- br="" get="">->
<- br="" get="">-><- br="" get="">
-- dict1 = ["xde", "xd", "efgh"]
-- runIdentity . runRWST (incr 'x' >> incr 'd' >> incr 'e' >> incr 'f') 0 $ ("", dict1)
-- (Done,("fedx",[]),fromList ["xd","xde"])->
Saturday, March 19, 2016
C interview quiz: how to reverse a string
So I heard someone was asked how to reverse a C string maybe in a coding interview. seems pretty easy? At least it's very easy come into some solution like this:
char* reverse(char* s, int n)
{
int i;
char ch;
for (i = 0; i < n / 2; i++) {
ch = s[i];
s[i] = s[n - i -1];
s[n -i - 1] = ch;
}
return s;
}
Well, this might can pass the coding test, for a serious C programmer, I don't think that's the right solution. The problem is as a low level C programmer, you need think like machines, machines do terribly when handling with bytes, but much better with words. Keep this in mind, we could find a much better solution (even it's longer):
static inline void swap64(long long* x, long long* y)
{
long long t = *x;
*x = *y;
*y = t;
}
static inline void swap32(int* x, int* y)
{
int t = *x;
*x = *y;
*y = t;
}
static inline void swap16(short* x, short* y)
{
short t = *x;
*x = *y;
*y = t;
}
static inline void swap8(char* x, char* y)
{
char t = *x;
*x = *y;
*y = t;
}
static inline char* fastrev(char* s, int i, int j)
{
while (i + 16 <= j) {
long long *l1 = (long long*)(s+i);
long long *l2 = (long long*)(s+j-8);
*l1 = bswap_64(*l1);
*l2 = bswap_64(*l2);
swap64(l1, l2);
i += 8;
j -= 8;
}
while(i + 8 <= j) {
int* i1 = (int*)(s+i);
int* i2 = (int*)(s+j-4);
*i1 = bswap_32(*i1);
*i2 = bswap_32(*i2);
swap32(i1, i2);
i += 4;
j -= 4;
}
while(i + 4 <= j) {
short* i1 = (short*)(s+i);
short* i2 = (short*)(s+j-2);
*i1 = bswap_16(*i1);
*i2 = bswap_16(*i2);
swap16(i1, i2);
i += 2;
j -= 2;
}
while (i < j) {
char* c1 = s+i;
char* c2 = s+j-1;
swap8(c1, c2);
i++;
j--;
}
return s;
}
char* reverse(char* s, int n)
{
return fastrev(s, 0, n);
}
While there are so much big tech asking algorithm questions (problem is 90% of exactly the same questions can be found online with answers..), reverse a string might not a laughable problem at all!
char* reverse(char* s, int n)
{
int i;
char ch;
for (i = 0; i < n / 2; i++) {
ch = s[i];
s[i] = s[n - i -1];
s[n -i - 1] = ch;
}
return s;
}
Well, this might can pass the coding test, for a serious C programmer, I don't think that's the right solution. The problem is as a low level C programmer, you need think like machines, machines do terribly when handling with bytes, but much better with words. Keep this in mind, we could find a much better solution (even it's longer):
static inline void swap64(long long* x, long long* y)
{
long long t = *x;
*x = *y;
*y = t;
}
static inline void swap32(int* x, int* y)
{
int t = *x;
*x = *y;
*y = t;
}
static inline void swap16(short* x, short* y)
{
short t = *x;
*x = *y;
*y = t;
}
static inline void swap8(char* x, char* y)
{
char t = *x;
*x = *y;
*y = t;
}
static inline char* fastrev(char* s, int i, int j)
{
while (i + 16 <= j) {
long long *l1 = (long long*)(s+i);
long long *l2 = (long long*)(s+j-8);
*l1 = bswap_64(*l1);
*l2 = bswap_64(*l2);
swap64(l1, l2);
i += 8;
j -= 8;
}
while(i + 8 <= j) {
int* i1 = (int*)(s+i);
int* i2 = (int*)(s+j-4);
*i1 = bswap_32(*i1);
*i2 = bswap_32(*i2);
swap32(i1, i2);
i += 4;
j -= 4;
}
while(i + 4 <= j) {
short* i1 = (short*)(s+i);
short* i2 = (short*)(s+j-2);
*i1 = bswap_16(*i1);
*i2 = bswap_16(*i2);
swap16(i1, i2);
i += 2;
j -= 2;
}
while (i < j) {
char* c1 = s+i;
char* c2 = s+j-1;
swap8(c1, c2);
i++;
j--;
}
return s;
}
char* reverse(char* s, int n)
{
return fastrev(s, 0, n);
}
(I didn't use swap macro with the well-known `xor` trick, because `xor` on non-words doesn't necessarily faster).
The 2nd solution is 3 times faster on my computer when revering 10^6 bytes string repeated 1000 times (have to do this, otherwise the bottle neck is I/O). Main function:
int main(int argc, char* argv[])
{
char* line;
size_t n;
ssize_t k;
int loop, loopcnt = 1000;
int nt;
scanf("%d\n", &nt);
while(nt--) {
k = getline(&line, &n, stdin);
line[k-1] = '\0';
loop = loopcnt;
while(loop--) {
reverse(line, k-1);
}
printf("%s\n", reverse(line, k-1));
}
free(line);
return 0;
}
{
char* line;
size_t n;
ssize_t k;
int loop, loopcnt = 1000;
int nt;
scanf("%d\n", &nt);
while(nt--) {
k = getline(&line, &n, stdin);
line[k-1] = '\0';
loop = loopcnt;
while(loop--) {
reverse(line, k-1);
}
printf("%s\n", reverse(line, k-1));
}
free(line);
return 0;
}
Timing Result:
[wangbj@nuc tmp]$ ls -l /tmp/i5.txt
-rw-r--r-- 1 wangbj users 1000003 Mar 19 00:43 /tmp/i5.txt
[wangbj@nuc tmp]$ time cat /tmp/i5.txt | ./rev > /dev/null
real 0m0.637s
user 0m0.621s
sys 0m0.015s
[wangbj@nuc tmp]$ time cat /tmp/i5.txt | ./rev2 > /dev/null
real 0m0.206s
user 0m0.197s
sys 0m0.015s
-rw-r--r-- 1 wangbj users 1000003 Mar 19 00:43 /tmp/i5.txt
[wangbj@nuc tmp]$ time cat /tmp/i5.txt | ./rev > /dev/null
real 0m0.637s
user 0m0.621s
sys 0m0.015s
[wangbj@nuc tmp]$ time cat /tmp/i5.txt | ./rev2 > /dev/null
real 0m0.206s
user 0m0.197s
sys 0m0.015s
While there are so much big tech asking algorithm questions (problem is 90% of exactly the same questions can be found online with answers..), reverse a string might not a laughable problem at all!
Thursday, March 10, 2016
Binary tree right side view with Haskell
data BTree a = Leaf | Branch a !(BTree a) !(BTree a) deriving Show
expand Leaf = return (mempty, [])
expand (Branch a l r) = return (Alt (Just a), [r, l])
rightSideView = mapM (getAlt . mconcat) . init . levels . runIdentity . unfoldTreeM_BF expand
Subscribe to:
Posts (Atom)